Italy must keep its funding pledges


The collapse of Italy’s coalition government has left researchers vulnerable. The incoming 
administration must keep a longstanding promise to end austerity in funding. 


Last week, Italy’s coalition government ended abruptly, when the
nationalist Lega party of deputy prime minister Matteo Salvini announced
that it was walking away from its turbulent coalition with the
anti-establishment M5S party, known as the Five Star Movement. The
collapse is of great concern: a much-delayed funding increase is now on
hold, and the political uncertainty adds further threat.

What will happen now is unclear. One of the coalition partners could
form a government with others in parliament, or an election might be
needed if an agreement cannot be reached. Italy’s head of state,
President Sergio Mattarella, will oversee the process. He needs to use
his discussions with party leaders to remind them of the coalition’s
promise to the nation’s scholars: that austerity in research funding
would come to an end.

The challenge for whoever takes office is that Italy’s economy has
been mostly stagnant for a decade. It also has high levels of debt and
could be on the brink of a recession. And as Italy, like other European
countries, aimed to shrink its budget deficit after the 2008-09 financial
crisis, funding for universities took a hit.

The coalition government had promised to return funding for
universities to 2009 levels of around €7.5 billion (US$8.3 billion). It had
also vowed to increase a smaller fund for research institutes, known as
the FOE, which has consistently been cut since 2013. These increases,
although modest, would have provided welcome relief for a system in
which most of the funding from the government is currently used to pay
for salaries and fixed costs, such as utility bills.

Furthermore, there is a possibility that indirect taxation — value-added
tax (VAT) — will need to rise, from 22% to 25%. Italy has exceeded
European Union limits on the size of its borrowing, and if the
government cannot cut €23 billion from public spending, it will need
to raise VAT. That will put even more pressure on research budgets.

Money is not the only issue. Lega was responsible for running the
interior ministry, and ministers clashed with scientists on the party’s
policies towards refugees and asylum-seekers — including an indefensible
law that imposes a €1-million fine on humanitarian ships patrolling
the Mediterranean looking to save people in distress. Academic
independence is also a concern. At the Ministry of Education, University
and Research — also the responsibility of Lega — there is evidence
that inspectors have been monitoring the teaching of political science
in schools. In some classes, they have been discussing whether today’s
government policies echo Italy’s Mussolini-era past. This has unsettled
teachers.

And although Italy’s spending on research and development — at
around 1.3% of its gross domestic product — sits well below the EU
average of 2%, its research performance continues to improve. Between
2000 and 2016, Italy’s share of published scientific papers increased from
3.2% to 4% and the number of publications as a fraction of spending on
research is comfortably above the EU average.

In his resignation speech to Italy’s senate, prime minister Giuseppe
Conte from the Five Star Movement spoke about the need to invest more
in research and to establish a national agency for research. Such words
are welcome, but not enough, and he must uphold his earlier promises
if his party returns to power.

After a decade of austerity, Italy's researchers and research leaders will 
need to dig deep yet again and find ways to hold the next government 
accountable for these promises. Mattarella, a former education minister, 
can and should also play a vital supporting part. As the head of state, he 
has no executive authority, but he does have moral authority. He needs 
to use it so that promised funds and scholarly autonomy are protected 
in the next administration.


Paying the price 
Universities must see that inadequate support 
of early-career researchers has consequences. 


Letters from research funders to university leaders rarely raise
eyebrows. But a letter sent this month by the heads of the United
Kingdom’s three largest medical-research funders did just that.

It says that some types of funding could be withheld unless universities
provide better support for early- and mid-career staff — particularly
women and trainees. And it warns that institutions could be prevented
from bidding for funded posts unless they change their ways. The letter
is signed by the heads of the Medical Research Council, the National
Institute for Health Research (NIHR) and Wellcome.

What has sparked funder frustration is the fact that universities
promise to look after new researchers when applying for grants — making
pledges including the provision of quality mentoring, or a path to
promotion. But in some cases these commitments are ignored once grant
money is banked — sometimes in violation of contracts.

No institutions are named in the letter, a copy of which has been
seen by Nature, but it points to “some very large and well-established
Universities and Medical Schools”.

One of the signatories — the NIHR — was an early adopter of tough
measures in support of advancing women’s careers. In 2011, it made
grants conditional on medical schools achieving a gold or silver in the
Athena SWAN Charter, a scheme designed to improve women’s career
prospects that has also raised awareness of the structural barriers to
gender equality in universities.

Athena SWAN has enabled many universities to take positive action
to advance equality and diversity. But when it comes to the needs of
early- and mid-career clinical researchers, the NIHR and the other
medical-research funders are right to challenge universities that are not
doing enough. A strongly worded letter warning universities that they
could be sanctioned unless they change is a necessary step.

Regulate facial-recognition technology


Until appropriate safeguards are in place, we need a moratorium on biometric
technology that identifies individuals, says Kate Crawford.


Earlier this month, Ohio became the latest of several state and
local governments in the United States to stop law-enforcement
officers from using facial-recognition databases. The move
followed reports that the Immigration and Customs Enforcement
agency had been scanning millions of photos in state driver’s licence
databases, data that could be used to target and deport undocumented
immigrants. Researchers at Georgetown University in Washington DC
used public-record requests to reveal this previously secret operation,
which was running without the consent of individuals or authorization
from state or federal lawmakers.

It is not the only such project. Customs and Border Protection
is using something similar at airports, creating a record of every
passenger’s departure. The technology giant Amazon is building
partnerships with more than 200 police departments to promote its Ring
home-security cameras across the United States. Amazon gets ongoing
access to video footage; police get kickbacks on technology products.

Facial-recognition technology is not ready for this kind of deployment,
nor are governments ready to keep it from causing harm. Stronger
regulatory safeguards are urgently needed, and so is a wider public
debate about the impact it is already having. Comprehensive legislation
must guarantee restrictions on its use, as well as transparency, due
process and other basic rights. Until those safeguards are in place, we
need a moratorium on the use of this technology in public spaces.

There is little evidence that biometric technology can identify
suspects quickly or in real time. No peer-reviewed studies have shown
convincing data that the technology has sufficient accuracy to meet the
US constitutional standards of due process, probable cause and equal
protection that are required for searches and arrests.

Even the world’s largest corporate supplier of police body cameras
— Axon in Scottsdale, Arizona — announced this year that it would
not deploy facial-recognition technology in any of its products because
it was too unreliable for police work and “could exacerbate existing
inequities in policing, for example by penalizing black or LGBTQ
communities”. Three cities in the United States have banned the use of
facial recognition by law-enforcement agencies, citing bias concerns.

They are right to be worried. These tools generate many of the same
biases as human law-enforcement officers, but with the false patina of
technical neutrality. The researchers Joy Buolamwini at the Massachusetts
Institute of Technology in Cambridge and Timnit Gebru, then at Microsoft
Research in New York City, showed that some of the most advanced
facial-recognition software failed to accurately identify dark-skinned
women 35% of the time, compared to a 1% error rate for white men.
Separate work showed that these technologies mismatched 28 US members
of Congress to a database of mugshots, with a nearly 40% error rate for
members of colour. Researchers at the University of Essex in Colchester,
UK, tested a facial-recognition technology used by London’s Metropolitan
Police, and found it made just 8 correct matches out of a series of 42,
an error rate they suspect would not be found lawful in court.
Subsequently, a parliamentary committee called for trials of
facial-recognition technology to be halted until a legal framework could
be established.

But we should not imagine that the most we can hope for is 
technical parity for the surveillance armoury. Much more than techni- 
cal improvements are needed. These tools are dangerous when they fail 
and harmful when they work. We need legal guard rails for all biometric 
surveillance systems, particularly as they improve in accuracy and inva- 
siveness. Accordingly, the AI Now Institute that I co-founded at New 
York University has crafted four principles for a protective framework. 

First, given the costly errors, discrimination and privacy invasions 
associated with facial-recognition systems, policymakers should not 
fund or deploy them until they have been vetted and strong protections 
have been put in place. That includes prohibiting 
links between private and government databases. 

Second, legislation should require that public 
agencies rigorously review biometric technolo- 
gies for bias, privacy and civil-rights concerns, as 
well as solicit public input before they are used. 
Agencies that want to deploy these technologies 
should be required to carry out a formal algo- 
rithmic impact assessment (AIA). Modelled 
after impact-assessment frameworks for human 
rights, environmental protection and data protec- 
tion, AIAs help governments to evaluate artificial- 
intelligence systems and guarantee public input. 

Third, governments should require 
corporations to waive any legal restrictions on researching or over- 
seeing these systems. As we outlined in the AI Now Report 2018, 
tech companies are currently able to use trade-secrecy laws to shield 
themselves from public scrutiny. This creates a legal ‘black box’ that is 
just as opaque as any algorithmic ‘black box’, and serves to shut down
investigations into the social implications of these systems. 

Finally, we need greater whistle-blower protections for technology- 
company employees to ensure that the three other principles are work- 
ing. Tech workers themselves have emerged as a powerful force of 
accountability: for example, whistle-blowers revealed Google’s work 
on a censored search engine in China. Without greater protections, 
they are in danger of retaliation. 

Scholars have been pointing to the technical and social risks of facial 
recognition for years. Greater accuracy is not the point. We need strong 
legal safeguards that guarantee civil rights, fairness and accountability. 
Otherwise, this technology will make all of us less free.


Kate Crawford is a distinguished research professor and co-director 
of the AI Now Institute at New York University, and a principal 
researcher at Microsoft Research in New York City. 

Twitter: @katecrawford 
INSTITUTIONS
Alaska funding 
Tenured faculty members in the University of Alaska (UA) system no
longer face the possibility of being laid off with 60 days’ notice.
UA’s governing board voted unanimously on 20 August to reverse its
declaration of “financial exigency”, which it made in July in response
to an unprecedented US$135-million cut to state funding for the
university system. Financial exigency grants the board extraordinary
powers to reduce costs, including the ability to fire faculty members
and end academic programmes. But the budget crisis eased on 13 August,
when Alaska’s governor Michael Dunleavy and UA administrators agreed
to a smaller, $25-million cut this year. The UA governing board will
meet in early September to discuss how to distribute this year’s cut,
and a proposal to consolidate the system’s three main branches — in
Anchorage, Fairbanks and Juneau — into one accredited institution.


MIT inquiry 

The Massachusetts Institute of Technology (MIT) is launching an
investigation into its interactions with sex offender and alleged sex
trafficker Jeffrey Epstein. The university, in Cambridge,
Massachusetts, received about US$800,000 in donations from the
disgraced financier over two decades, MIT president Rafael Reif said
on 22 August. All of Epstein’s donations went to either the MIT Media
Lab or to physics professor Seth Lloyd. “In this instance, we made a
mistake of judgment,” Reif said. Lloyd and MIT Media Lab director
Joichi Ito have issued public apologies for their dealings with
Epstein. The MIT announcement came days after two researchers cut ties
with the Media Lab because of the university’s interactions with
Epstein.


3.8-million-year-old skull discovered


Scientists have discovered a 3.8-million-year-old hominin skull
(pictured) in Ethiopia that could help to clarify the origins of Lucy,
our famous forerunner. The specimen suggests that Lucy’s species
coexisted with an ancestor in the ancient Ethiopian landscape. Most
researchers think that Lucy’s species, Australopithecus afarensis,
falls on the same branch of the evolutionary tree as an earlier species
called Australopithecus anamensis. The idea is that A. anamensis
gradually morphed into A. afarensis, implying that the two species
never coexisted. The skull, described this week in Nature, suggests
otherwise. The fossil’s facial features indicate that it belongs to
A. anamensis, and strengthen the case that a previously discovered
fossil, a 3.9-million-year-old face fragment found in the 1980s,
belongs to A. afarensis. This suggests that the two species coexisted,
after all. A. afarensis may have evolved from a small A. anamensis
group before gradually outcompeting the wider A. anamensis population.
Polio milestone

Polio is no longer endemic in Nigeria, the World Health Organization
(WHO) said on 21 August, as the country marked three years without any
new cases of the paralysing disease. Nigeria is the last country in
Africa in which polio has circulated in the wild; now, the entire
continent could be declared polio-free next year. The WHO, private
donors and governments have led a multibillion-dollar global campaign
to eradicate polio. The number of new infections has fallen globally,
from roughly 350,000 in 1988 to 33 in 2018.


UK immigration 

The UK government has said that freedom of movement as it currently
stands for European Union citizens will end as soon as the country
leaves the bloc on 31 October. This means that EU scientists coming to
work in the United Kingdom after this date would be subject to new
immigration arrangements, which the government promised to publish
“shortly” in an announcement on 19 August. The previous government’s
policy would have left the rights of EU citizens coming to study or
work in the United Kingdom essentially unchanged at least until the
end of next year. Experts have questioned whether it is possible to
implement a new immigration policy without a way of distinguishing
between existing EU migrants, whose rights remain unchanged, and those
arriving in the United Kingdom soon after the Brexit date. Science
organizations have expressed concern at the move, which they say
creates uncertainty among employers.


Trump lawsuit 


A coalition of environmental groups filed a lawsuit against the
administration of US President Donald Trump on 21 August to block a
rule that weakens protections for threatened species. The changes —
finalized on 12 August by the Fish and Wildlife Service and the
National Marine Fisheries Service — affect how the Endangered Species
Act is applied, and constitute some of the most significant alterations
to the law since it was enacted in 1973. The revisions remove blanket
protections for animals and plants that are listed as threatened, a
category for organisms at risk of becoming endangered. The changes
also allow federal agencies to conduct economic analyses when deciding
whether to protect a species.


Giraffe protections 


Nations have agreed to regulate trade in giraffes (pictured) for the
first time. The decision — which is expected to be finalized this week
— was made at a meeting of parties to the Convention on International
Trade in Endangered Species (CITES) in Geneva, Switzerland. Nine
giraffe species will be protected under Appendix II of the convention,
which protects species that could have faced extinction had trade
restrictions not been implemented. Countries also voted to protect 18
shark and ray species — many of which are hunted for their meat and
fins — under Appendix II. But the parties stopped short of approving
amendments to shut down all domestic ivory markets.


Amazon funds 


Brazil has rejected an offer from the world’s seven largest economies
(G7) to provide US$22 million in immediate funding to help put out
fires in the Amazon (see ‘Trend watch’). The fund was put together by
France’s President Emmanuel Macron and pledged at the G7 annual
meeting in Biarritz, France, on 26 August. After initially accepting
the funding, the Brazilian government declined the offer. Earlier,
Macron’s decision to put the Amazon on the G7 agenda angered Brazil’s
President Jair Bolsonaro, who accused France of acting in a colonial
way. Bolsonaro said that he is mobilizing Brazil’s military to drop
water on burning regions, and that Amazon countries should be able to
deal with the issue without outside help. The G7 meeting had a strong
focus on the environment and development, and also produced an
agreement between the European Union, the G7 and international funding
agencies to provide more support for the countries of the Sahel.


SEVEN DAYS | THIS WEEK


Moon mission 

India’s Chandrayaan-2 spacecraft entered the Moon’s orbit on 20 August,
says the nation’s space agency. The event is a milestone in the
country’s second mission to the Moon: it will be its first attempt at
a ‘soft’ landing on the lunar surface. Early next week, the lander
will separate from the orbiter, which will continue to circle the Moon
for another year. The lander, which carries a six-wheeled rover called
Pragyan, is due to touch down near the south pole on 7 September. If
the landing is successful, India’s will be the fourth space agency,
after those of the United States, the Soviet Union and China, to
perform a soft landing.


The Brazilian Amazon is burning, and the world is taking notice. So
far this year, more than 80,000 wildfires have burnt in Brazil — the
majority in the Amazon — amounting to an increase of roughly 80% over
the same period last year, according to the country’s National
Institute for Space Research (INPE).

The Amazon is the world’s largest rainforest and it contains several
million plant, animal and insect species. It also acts as a huge
carbon sink that helps to cool global temperatures. The wildfire data,
which INPE released on 20 August, have prompted an international
outcry. In a tweet on 22 August, French President Emmanuel Macron
called for discussion of the fires at the G7 summit he was hosting in
Biarritz from 24 to 26 August. German Chancellor Angela Merkel backed
Macron’s call. But Brazilian President Jair Bolsonaro hit back,
tweeting that Macron was using the situation for his own political
gain. Critics of Bolsonaro say that his push to make the Amazon more
accessible to industries such as logging and agriculture is partly
responsible for the rise in the number of fires.


RECORD BURN
Brazil’s space-agency satellites have detected hotspots — fires with a
front at least 30 metres long — in record numbers this year. Most fires
are burning in the country’s Amazon rainforest.

Chart: fires (thousands) by month, January to August*, for 2019 and
2018, with an annotation marking the start of the dry season in
Brazil’s Amazon.

*Data current as of 26 August.
The University of Adelaide suspended Alan Cooper as leader of the prestigious Australian Centre for Ancient DNA following an investigation.


RESEARCH CULTURE 


Anxiety mixed with great 
science in troubled DNA lab 


Researchers say lab leader Alan Cooper, who was suspended last week, bullied them. 


BY DYANI LEWIS 


The University of Adelaide has
suspended the leader of its ancient- 
DNA centre, Alan Cooper, following an 
investigation into the ‘culture’ at the Austral- 
ian laboratory. The university has not given a 
reason for its decision, but current and former 
co-workers of Cooper — an award-winning 
evolutionary biologist who specializes in 
human migration — have told Nature that he 
bullied them and others. 


Their accounts paint a picture of a lab that 
was exciting scientifically — but that had a 
toxic work environment. Former student 
Nic Rawlence says he was bullied while at the 
Australian Centre for Ancient DNA (ACAD) 
and developed stress-induced health issues. 
Another former student, Dean Male, says he 
left the lab as a result of Cooper's bullying. “I 
couldn't get out of there fast enough,” he says. 

Nature interviewed nine of Cooper’s current 
and former co-workers. Four — including one 
current team member — say that he bullied 


them; four more, two of whom still work at 
the centre, say that they observed him bul- 
lying team members. Most of those people 
requested anonymity for fear of damaging 
their academic careers. Three of those who 
allege that Cooper bullied them gave evidence 
to the investigation, as did two of those who 
say they observed it. 

Another former colleague, Paul Brotherton, 
told Nature that although Cooper is brash, 
he is not a bully. Cooper could be disdainful 
towards someone and their work if it wasn’t
going to lead to a high-profile publication,
he says. “Perhaps he’s not very good at disguis- 
ing his impatience and his lack of interest.” 
Some of the people Nature spoke to say they 
had complained before but that things did not 
change. Others say they did not 
make formal complaints for fear 
that Cooper would find out and the 
bullying would get worse. 
Rawlence says he’s “cautiously 
optimistic” that the university's 
decision to suspend one of its most 
prominent scientists is a sign that 
the allegations against Cooper are 
being taken seriously. But others 
are sceptical that the university will 
take further action or that the situ- 
ation will improve, citing the fund- 
ing that Cooper brings in, and the 
fact that previous complaints seem 
to have had little effect. In 2016, 
Cooper was named South Austral- 
ian Scientist of the Year. He has also 
been awarded millions of dollars in 
highly competitive grants from the 
Australian Research Council. 
Several of the researchers say that 
the university should permanently 
remove Cooper as leader of ACAD, 
which has about 36 staff and stu- 
dents, according to its website. “He 
is just going to tear up lives as long as he’s in 
that role,” says one former student. 
At the time of publication, Cooper had not 
responded to Nature’s request for comment. 
Cooper is a pioneer of ancient-DNA 
research, and his work to improve extraction 
techniques in the mid-1990s transformed the 
field. In 2001, he sequenced the first full mito- 
chondrial genome from an extinct animal, two 
species of the New Zealand moa (Emeus crassus 
and Dinornis giganteus; A. Cooper et al. Nature 
409, 704-707; 2001). He has also characterized 
plaque on ancient teeth to understand changes 
in early-human diet across Europe (C. J. Adler 
et al. Nature Genet. 45, 450-455; 2013). A 
project he leads to sequence the genomes of 
Indigenous Australian groups was awarded a 
prestigious Australian Museum Eureka Prize 
in 2017. 


NIGHTMARE LAB 
Cooper’s suspension comes after the university
engaged SAE Consulting in Adelaide to conduct
a ‘culture check’ of ACAD in July. Cooper
was not named as a focus of the probe, and the
university did not say what prompted it, but
on 19 August, ACAD students and staff were
notified of Cooper’s suspension. “Following on
from the information provided, the University
has decided to take further action,” a spokesperson
for the university told Nature. Cooper
will remain suspended pending “the outcome
of further processes”, the statement read.
Rawlence was at ACAD from 2006 to 2013 
and gave evidence to the investigation. He 
says Cooper would yell at him, sometimes 


in front of colleagues, and criticize his work. 
“It was pretty much an everyday occurrence,” 
says Rawlence, who now leads a lab at the 
University of Otago in Dunedin, New Zealand. 

Nic Rawlence alleges that Alan Cooper bullied him at ACAD.

Male, who was a senior researcher at ACAD
from 2006 to 2007 and did not give evidence
to the investigation, says his experience of
working in the world-class lab was marred by
Cooper’s bullying. “It was fantastic science,
really breathtaking, cutting-edge stuff,” he said.
Cooper often targeted the most vulnerable
people in the lab, according to Male, who still
works in research but has left academia.
Male recalls hearing Cooper’s shouting
from behind his closed office door, and was
himself yelled at several times. “He’d kind of
stalk and walk a bit, warming up, and then the
door would close and he’d be behind you and it
was actually quite intimidating, and then the
shouting and yelling would start,” he says.

Cooper's criticisms of students’ work were 
unconstructive and tinged with personal 
insults, according to a former ACAD student 
who witnessed Cooper bullying others. “It 
borders on cruel because it’s just so relentless 
and not everyone is subjected to it,” they say. 

The current ACAD student who accuses 
Cooper of bullying them and who gave evidence 
to the investigation told Nature in an e-mail that 
they were surprised when they came out of a 
meeting unscathed. “I was frequently paralysed
by anxiety and feelings of inadequacy.”

Some students say Cooper took an unusu- 
ally long time to read their papers and theses — 
sometimes several months — and was slow to 
sign paperwork that allowed them to graduate. 

Rawlence says he had to lodge a 
formal complaint to the then-dean of graduate
studies, Richard Russell, to get Cooper to read 
his PhD thesis so that he could complete his 
studies. Rawlence says Cooper then complied. 

Rawlence and another former student 
who alleges they were bullied 
say they told their postgraduate 
coordinator about Cooper, and 
were informed that the univer- 
sity was aware of problems with 
his behaviour. They also say they 
complained to the university’s 
management. The university did 
not indicate to them whether any 
steps had been taken to address 
the grievances, they say. 

Another former student says 
they left without completing their 
studies partly owing to Cooper's 
behaviour. 

But Brotherton, who worked as 
a postdoc with Cooper at the Uni- 
versity of Oxford, UK, and later at 
ACAD, doesn't think Cooper is a 
bully. In his opinion, many of the 
alleged incidents are about per- 
sonality differences. “[Alan] won't 
win empathetic boss of the year
competition, but he’s not a savage
bully,” says Brotherton, who
no longer works in academia. He
does say, however, that Cooper can be “quite
abrasive and in-your-face”, and that behaviours
such as taking less interest in some
people’s projects are sins of “omission rather
than commission”.


AIRING GRIEVANCES 

Most of the people whom Nature interviewed 
say that they were relieved when the university 
launched the culture check. But some have also 
questioned whether the scope of the investi- 
gation was too narrow. Rawlence and several 
other former students say that, initially, only 
current students were asked to participate. 

Rawlence ended up participating only 
because colleagues currently at the centre 
alerted some former students to the probe, 
which prompted him and some others, he says, 
to contact the consultant leading the investi- 
gation, SAE Consulting’s Sophie Rayner. But 
because the university didn’t initially approach
former students, some of the students worry 
that the probe might have missed accounts 
from past members of the lab. 

Others complain that they could not give 
anonymous accounts to the investigation. 
One former student says Rayner told them
that the university did not want anonymous
accounts, and so they decided against giving their
account of witnessing bullying behaviour.
SAE Consulting principal Sallie Emmett 
says the firm does not comment on matters 
relating to clients. 

The university declined to comment when 
asked about the investigation and its handling 
of previous complaints against Cooper.
Huge US government study
will offer genetic counselling 


The National Institutes of Health has hired a firm to help participants cope with results. 


BY JONATHAN LAMBERT 


A US government project that aims to
sequence the genomes of one million
volunteers will partner with a genetic-counselling
company to help participants
understand their results. It will be the largest
US government study to provide such a service.
The National Institutes of Health (NIH) in 
Bethesda, Maryland, is leading the project, 
called All of Us. And on 21 August, the agency 
announced the award of a US$4.6-million, 
5-year grant to Color. 

The firm, in Burlingame, California, will 
counsel every study participant with a genetic 
variant that could have serious health implica- 
tions — such as BRCA mutations associated 
with breast cancer — when they receive their 
results. Color will also develop educational 
materials for all study participants, and will 
offer telephone consultations to those who wish 
to discuss their results with a counsellor. 

“This is a really responsible and more 
equitable way of communicating the results 
of research to all participants,” says Bartha 
Knoppers, the director of the Centre of 
Genomics and Policy at McGill University in 
Montreal, Canada. “They’re laying the foun- 
dations for building good bridges between the 
findings and the people.” 

The All of Us study, which launched in 
May 2018, aims to enrol at least one million 
people. Participants will be asked to provide 
a host of health information, including elec- 
tronic health records, genomic data and blood 
and urine samples. Study researchers also plan 
to collect data recorded by personal activity 
trackers, such as those found on smartphones. 
They will store the information in an online 
database that outside scientists can access with 
permission from the programme. 

Enrolling participants from ethnic and 
socio-economic groups that are typically under- 
represented in biomedical research is a prior- 
ity for the study’s organizers. Most genomic 
research until now has been conducted on 
non-Hispanic white people. One recent review 
found that as of 2018, 78% of people included 
in genomic studies of disease were of European 
descent (G. Sirugo et al. Cell 177, 26-31; 2019). 
That bias narrows the applicability of conclu- 
sions from genetic-testing studies, and can lead 
to misleading or dangerous interpretations of 
genetic variants found in other populations. 


Researchers running a genetic-sequencing project in the United States aim to recruit one million people. 


The All of Us study has enrolled 175,000 peo- 
ple around the United States so far. About 50% 
are people of colour, and 80% are from groups 
that have historically been under-represented 
in biomedical research. The study’s scientists 
have yet to sequence any genomes, but they 
hope to provide participants with results in the 
first half of 2020, says Stephanie Devaney, the 
deputy director of All of Us. 

To generate the kind of long-term data set 
necessary for breakthroughs in precision medi- 
cine — which uses genomic, physiological and 
other data to tailor treatments to individuals 
— All of Us must retain these participants, ide- 
ally throughout their lives. That’s where genetic 
counselling comes in. 

“It's imperative to our mission that we return 
value to our participants, that we communicate 
back the results of [our] research,” says Devaney.


WORKING OUT THE DETAILS 

This is a step in the right direction, says Amy 
McGuire, a bioethicist at Baylor College of 
Medicine in Houston, Texas. But “the devil is 
in the details”, she adds.

And Devaney and her colleagues need to 
work out a lot of details — including what the 
programme will tell participants about their 
own genomes, and how. A genetic counsellor 
will give people information on genetic vari- 
ants that have clear, actionable consequences 
for health, such as those in the BRCA gene. But
study organizers are still discussing how much
to tell participants about genetic variants that 
don’t have such an explicit link to illness. 

Their task is complicated by the fact that 
knowledge about genetic variants can change 
over time. A mutation that researchers now 
think is benign could one day be considered 
an indication of increased cancer risk. All of 
Us participants are told that the implications 
of their genetic-test results could change as 
scientists learn more about certain mutations, 
says Brad Ozenberger, genomics programme 
director at All of Us. But he and his colleagues 
are still working out how frequently to notify 
participants of such developments. 

The effects of a genetic variant can also 
depend on ethnicity. Certain genetic tests 
that physicians use to help determine whether 
someone with cancer should undergo chemo- 
therapy have been tested only in white Europe- 
ans. It’s unclear whether these are accurate for 
people of colour. All of Us and Color say that 
they are working out the best way to commu- 
nicate such uncertainties to study participants. 

But the company says that it’s prepared 
to have those conversations. “We've worked 
with a lot of diverse communities,” says Alicia 
Zhou, vice-president of research and scientific 
affairs at Color. These include technology and 
manufacturing companies, railway workers in 
Alaska and residents of Trinidad and Tobago, 
she adds.
CRISPR turns gels into
biological watchdogs 


Gene-editing tool used to trigger smart materials that can 
deliver drugs and sense biological signals. 


BY EWEN CALLAWAY 


Is there anything CRISPR can't do?
Scientists have wielded the gene-editing
tool to make scores of genetically modified
organisms, as well as to track animal develop-
ment, detect diseases and control pests. Now,
they have found yet another application for it:
using CRISPR to create smart materials that
change their form on command.

The shape-shifting materials could be used
to deliver small molecules, and to create senti-
nels for almost any biological signal, research-
ers reported on 22 August (M. A. English et al.
Science 365, 780-785; 2019). The study was led
by James Collins, a bioengineer at the Massa-
chusetts Institute of Technology in Cambridge.

Collins’s team worked with water-filled
polymers that are held together by strands of
DNA, known as DNA hydrogels. To alter the
properties of these materials, Collins and his
team turned to a form of CRISPR that uses a
DNA-snipping enzyme called Cas12a. (The
gene editor CRISPR-Cas9 uses the Cas9
enzyme to snip a DNA sequence at the desired
point.) The Cas12a enzyme can be programmed
to recognize a specific DNA sequence. The
enzyme cuts its target DNA strand, then severs
single strands of DNA nearby.

This property allowed the researchers to
build a series of CRISPR-controlled hydrogels
containing a target DNA sequence and single
strands of DNA, which break up after Cas12a
recognizes the target sequence in a stimu-
lus. The break-up of the single DNA strands
triggers the hydrogels to change shape or, in
some cases, completely dissolve, releasing a
payload (see ‘CRISPR-controlled gel’).


SMART OBJECTIVES 

The team created hydrogels programmed to
release enzymes, small molecules and even
human cells — for instance, as part of a therapy
— in response to stimuli. Collins hopes that the
gels could be used to make smart therapeutics
that release, for example, cancer drugs in the
presence of a tumour, or antibiotics around an
infection.

The researchers also integrated CRISPR-con- 
trolled hydrogels into electronic circuits. In one 
approach, they placed hydrogels inside a small 
chip-like device called a microfluidic chamber 
that was linked to an electronic circuit. The cir- 
cuit switched off in response to the detection of 
genetic material from pathogens including the 
Ebola virus and methicillin-resistant Staphylo- 
coccus aureus (MRSA). The team even used the 
hydrogels to develop a prototype diagnostic tool 
that sends a wireless signal when it recognizes 
genetic material from Ebola in lab samples. 

Dan Luo, a bioengineer at Cornell Univer- 
sity in Ithaca, New York, says that the CRISPR 
hydrogels are an improvement on other 
responsive hydrogels because scientists can 
easily determine what triggers a change in the 
material. 

“We're in the CRISPR age right now,” Collins
says. “It’s taken over biology and biotechnol-
ogy. We've shown that it can make inroads into
materials and bio-materials.”


CRISPR-CONTROLLED GEL
Researchers have created a smart hydrogel material that is held together by DNA. The CRISPR-Cas12a
protein cuts the DNA strands, changing the gel’s shape, which can be controlled to release drugs, particles or
even switch an electronic circuit.
Figure labels: Cas12a enzyme cuts DNA strands; cargo vessels; release.


Geologist’s sacking 
prompts outcry 


Tenured professor dismissed from University of Copenhagen. 


BY QUIRIN SCHIERMEIER 


For the second time in three years,
geoscientists are protesting against the
dismissal of a geologist from the Univer-
sity of Copenhagen.
The management of the university’s 
science faculty dismissed Irina Artemieva,
a tenured professor and internationally
esteemed specialist in lithospheric geophysics, 
on 29 July — saying that she has repeat- 
edly failed to fulfil various administrative 
and teaching duties. They allege that she 
has failed to use the appropriate calendar to 
plan holidays; travelled to conferences with- 
out approval; and caused inconvenience to 
examination and teaching schedules. “Your
actions and behaviour have had a negative
impact on the performance of your duties
relating to teaching and research activities in
overall terms,” the faculty told Artemieva in
the July letter informing her of her dismissal. 
Artemieva denies the accusations, and 
defended herself in a 128-page document sent 
to the faculty of science after the management 
informed her in May that it was contemplat- 
ing her dismissal. She says that all her external 
work activities, including field trips, conference 
attendance and editorial work, are standard 
professional undertakings that she has docu- 
mented as required by the university's rules. 
An international group of 32 geoscientists 
says that the university's action is problematic 
because the reasons given do not warrant the 
dismissal of a tenured professor, by interna-
tional academic standards. This — combined
with the similar dismissal of another
geologist three years ago from the 
same faculty, which geoscientists 
also protested about — threatens 
the reputation of the University 
of Copenhagen and the Danish 
university system, they say in a July 
letter sent to the university after 
it had told Artemieva that it was 
considering her dismissal. 

In 2016, the faculty’s management 
sacked Hans Thybo, a prominent 
geologist who was, at the time, pres- 
ident of the European Geosciences 
Union, over his use of a private 
e-mail account for work purposes. 
A group of geoscientists similarly 
criticized that sacking, and urged the univer- 
sity to reconsider its decision. Thybo, now a 
researcher at the University of Oslo, appealed 
against the sacking, and received a settlement 
of six months’ salary after arbitration discus- 
sions between the university and a trade union 
representing academic employees — but he 
was not reinstated to his post. 


PERSONAL DISAGREEMENTS 

“Throughout most of the developed world,
a tenured professor can only be dismissed
for gross misconduct or criminal activity,”
the group of geoscientists wrote. “Professor
Artemieva’s dismissal appears to be based
on personal disagreements between her and
the management of the department,” the
scientists wrote. “At least on these occasions,
the University of Copenhagen is not adher-
ing to the international standards of academic
freedom and the rights of its employees.”

Irina Artemieva is a specialist in lithospheric geophysics.

“This new dismissal will damage the 
reputation of the university system and the 
country’s scientific community even more 
than the earlier case,” they wrote. 

“Irina is an outstanding researcher, adviser
and geoscience community member,” says
Seth Stein, an Earth scientist at Northwestern
University in Evanston, Illinois, who organized
the protest letter to the university. “Losing her
would be a great loss to the geophysics pro-
gramme at the University of Copenhagen.”




The University of Copenhagen 
declined Nature’s request for com- 
ment on the dismissal, saying that it 
does not discuss matters concerning 
individual employees. The Danish 
ministry for science and education 
also declined to comment on the case, 
or on the suggestion that the dismissal 
would harm Danish universities’ 
reputations. 

Artemieva says that her treatment 
has amounted to discrimination — 
complaints that the university says, in 
its letters to her, are unsubstantiated. 
The researcher, who is originally from 
Russia and was the only female pro- 
fessor in her department, says that she 
was consistently made to feel unwelcome after 
gaining her tenured position through an open 
call for applications. “No matter what I would 
do, I was facing professional enmity here from 
the very start,” she says. 

In Artemieva’s dismissal letter, the depart- 
ment’s dean, John Renner Hansen, says that 
the faculty of science “does not recognize the 
picture of [Artemieva] having been exposed 
to ‘harassment’, ‘bullying’ and ‘discrimination’
since you were appointed professor”. It adds:
“Your actions have been confrontational and
conflict-escalating ... Rather than responding
to the critique raised, you continue to make
accusations against different management
members.”


Brazil budget cuts threaten 
80,000 science scholarships 


The country’s main research-funding agency could stop payments as soon as September. 


BY RODRIGO DE OLIVEIRA ANDRADE 


Brazil’s main science-funding agency will
have to suspend more than 80,000 schol-
arships to postdoctoral researchers and
graduate and undergraduate students starting 
in September unless it receives additional cash 
from the government. 

The National Council for Scientific 
and Technological Development (CNPq) 
announced the impending cancellations on 
15 August. The CNPq also won't be offer- 
ing new scholarships, according to the state- 
ment. Brazil’s government hasn't released the 
330 million reais (US$89 million) that it froze 
in the CNPq’s budget as part of broader spend- 
ing cuts in March. If President Jair Bolsonaro’s 
administration doesn't release some of the
money soon, the CNPq’s scholarship fund will
run out of cash by next month. 

“Government is jeopardizing the future of a 
whole generation of Brazilian scientists,” says 
Paulo Artaxo, a physicist at the University of 
Sao Paulo. Cancelling the scholarships will 
have a devastating impact on Brazilian science, 
which depends on these young researchers, 
he says. 

Not supporting students in research pro-
grammes “is like shooting oneself in the foot”,
says Alexander Turra, an oceanographer at the 
Oceanographic Institute of the University of 
Sao Paulo. 


A MATTER OF SURVIVAL 
Biologist Nicole Malinconico is one of many 
graduate students who might have to leave 


research if the CNPq scholarships fall through. 
She moved to Sao Paulo in January and has 
applied to the doctorate programme at the 
Oceanographic Institute. 

“Now, even if I enter the doctorate 
[programme], without the scholarship I won't 
be able to keep myself in Sao Paulo,” says 
Malinconico. She plans to apply for a schol- 
arship offered by the Sao Paulo Research 
Foundation, a local science-funding agency. 
But the competition for alternative sources of 
money has grown stiff, she says. Malinconico 
fears that she will have to give up her research 
career to look for a job outside academia, as 
many of her friends are doing. 

“For many students, a scholarship is much
more than research support, it is a salary that
they use to live, to eat and to pay their bills,”
says Daniel Martins-de-Souza, a biochemist
at the University of Campinas in Brazil.
Without that support, lots of researchers will
be out of work, which could shift Brazil’s over-
all unemployment figures, he says.

Students in Brazil’s capital protested against cuts to education and science funding earlier this year.

The Brazilian Society for the Advancement 
of Science, based in Sao Paulo, along with 
97 other research and academic institu- 
tions in the country, launched an online 
petition on 13 August demanding that the 


government help the CNPq meet its funding 
commitments. As of 27 August, it has more 
than 900,000 signatures. 


GOING BACKWARDS 

Researchers in Brazil have been working 
under a cloud of uncertainty since March, 
when Bolsonaro’s administration announced 
that it would freeze 42% of the budget of
the science and communications ministry
(MCTIC). This included the freeze in the
budget of the CNPq, which is an agency
within the MCTIC. Around that time, the
government also announced that it would
cut 30% of the funds that it gives to federal
universities. 

Many researchers left Brazil for better 
situations abroad, and those who stayed 
have struggled to keep their laboratories 
functioning. 

“Science is walking backwards in Brazil,” 
says Marcos Buckeridge, the director of the 
National Institute of Bioethanol Science and 
Technology. 

The institute includes 31 laboratories in 
5 Brazilian states that develop technology 
to produce biofuels using materials such as 
plants or animal waste. Buckeridge fears that 
if the CNPq stops funding student and post- 
doctoral scholarships, in the next few months 
the institute won't have enough researchers 
to run experiments. 

The CNPq and the MCTIC are in nego- 
tiations with the Ministry of Economy for 
more money by the end of the year so that 
the agency can support scholarships, says 
CNPq spokesperson Mariana Galiza de 
Oliveira. But it’s unclear whether the agency 
will receive the money in time to avoid an 
interruption to payments for current scholar-
ship holders, she says.
POLICING SELF-CITATIONS


Some top academics cite themselves heavily, and 
researchers are debating what to do about it. 


BY RICHARD VAN NOORDEN AND DALMEET SINGH CHAWLA 


The world’s most-cited researchers,
according to newly released data,
are a curiously eclectic bunch.
Nobel laureates and eminent polymaths
rub shoulders with less familiar names,
such as Sundarapandian Vaidyanathan
from Chennai in India. What leaps out
about Vaidyanathan and hundreds of
other researchers is that many of the
citations to their work come from their
own papers, or those of their co-authors.

COUNTRY BY COUNTRY
Figure: median self-citation rate; country labels include Ukraine.

Vaidyanathan, a computer scientist
at the Vel Tech R&D Institute of Tech-
nology, a privately run institute, is an
extreme example: he has received 94%
of his citations from himself or his co-
authors up to 2017, according to a study
in PLoS Biology this month. He is not
alone. The data set, which lists around
100,000 researchers, shows that at least
250 scientists have amassed more than
50% of their citations from themselves
or their co-authors, whereas the median
self-citation rate is 12.7%.

The study could help to flag potential
extreme self-promoters, and possibly
‘citation farms’, in which clusters of
scientists massively cite each other, say the
researchers. “I think that self-citation farms are
far more common than we believe,” says John
Ioannidis, a physician at Stanford University
in California who led the work and specializes
in meta-science — the study of how science is
done. “Those with greater than 25% self-citation
are not necessarily engaging in unethical behav-
iour, but closer scrutiny may be needed,” he says.

The data are by far the largest collection of
self-citation metrics ever published. And they
arrive at a time when funding agencies, journals
and others are focusing more on the potential
problems caused by excessive self-citation.
In July, the Committee on Publication Ethics
(COPE), a publisher-advisory body in London,
highlighted extreme self-citation as one of the
main forms of citation manipulation. This issue
fits into broader concerns about an over-reli-
ance on citation metrics when making decisions
about hiring, promotions and research funding.

“When we link professional advance-
ment and pay attention too strongly to
citation-based metrics, we incentivize self-
citation,” says psychologist Sanjay Srivastava
at the University of Oregon in Eugene.

Although many scientists agree that
excessive self-citation is a problem, there is
little consensus on how much is too much
or on what to do about the issue. In part,
this is because researchers have many legiti-
mate reasons to cite their own work or that of
colleagues. Ioannidis cautions that his study
should not lead to the vilification of particu-
lar researchers for their self-citation rates, not
least because these can vary between disci-
plines and career stages. “It just offers com-
plete, transparent information. It should not be
used for verdicts such as deciding that too high
self-citation equates to a bad scientist,” he says.


DATA DRIVE

Ioannidis and his co-authors didn't publish
their data to focus on self-citation. That's just
one part of their study, which includes a host
of standardized citation-based metrics for the
most-cited 100,000 or so researchers over
the past 2 decades across 176 scientific
sub-fields. He compiled the data together
with Richard Klavans and Kevin Boyack
at analytics firm SciTech Strategies in
Albuquerque, New Mexico, and Jeroen
Baas, director of analytics at the Amster-
dam-based publisher Elsevier; the data
all come from Elsevier’s proprietary
Scopus database. The team hopes that
its work will make it possible to identify
factors that might be driving citations.
But the most eye-catching part of the
data set is the self-citation metrics. It is
already possible to see how many times
an author has cited their own work
by looking up their citation record in
subscription databases such as Scopus
and Web of Science. But without a view
across research fields and career stages,
it’s difficult to put these figures into
context.

Vaidyanathan’s record stands out as 
one of the most extreme — and it has 
brought certain rewards. Last year, he 
won a 20,000-rupee (US$280) award for 
being among the nation’s top research- 
ers by measures of productivity and citation 
metrics. Vaidyanathan did not reply to Nature’s 
request for comment, but he has previously 
defended his citation record in reply to ques- 
tions about Vel Tech posted on Quora, the 
online question-and-answer platform. In 2017, 
he wrote that because research is a continuous 
process, “the next work cannot be carried on
without referring to previous work”, and that
self-citing wasn't done with the intention of 
misleading others. 

Two other researchers who have gained 
plaudits and cite themselves heavily are 
Theodore Simos, a mathematician whose web- 
site lists affiliations at King Saud University in 
Riyadh, Ural Federal University in Yekaterin- 
burg, Russia, and the Democritus University 
of Thrace in Komotini, Greece; and Claudiu 
Supuran, a chemist at the University of Florence, 
Italy, who also lists an affiliation at King Saud 
University. Both Simos, who amassed around 
76% of his citations from himself or his 
co-authors, and Supuran (62%) were last year 
named on a list of 6,000 “world-class
researchers selected for their excep- 
tional research performance” produced 
by Clarivate Analytics, an information- 
services firm in Philadelphia, Pennsylva- 
nia, which owns Web of Science. Neither 
Simos nor Supuran replied to Nature’s 
requests for comment; Clarivate said that 
it was aware of the issue of unusual self- 
citation patterns and that the methodol- 
ogy used to calculate its list might change. 


WHAT TO DO ABOUT SELF-CITATIONS? 

In the past few years, researchers have 
been paying closer attention to self- 
citation. A 2016 preprint, for instance, 
suggested that male academics cite their 
own papers, on average, 56% more than 
female academics do, although a repli-
cation analysis last year suggested that
this might be an effect of higher self-
citation among productive authors of
any gender, who have more past work to
cite. In 2017, a study showed that sci-
entists in Italy began citing themselves
more heavily after a controversial 2010
policy was introduced that required
academics to meet productivity thresholds
to be eligible for promotion. And last year,
Indonesia's research ministry, which uses a 
citation-based formula to allocate funding for 
research and scholarship, said some researchers 
had gamed their scores using unethical prac- 
tices, including excessive self-citations and 
groups of academics citing each other. The 
ministry said that it had stopped funding 
15 researchers and planned to exclude self- 
citations from its formula, although researchers 
tell Nature that this hasn't yet happened. 

But the idea of publicly listing individuals’ 
self-citation rates, or evaluating them on the 
basis of metrics corrected for self-citation, 
is highly contentious. For instance, in a dis-
cussion document issued last month, COPE
argued against excluding self-citations from 
metrics because, it said, this “doesn't permit a 
nuanced understanding of when self-citation 
makes good scholarly sense”. (See go.nature. 
com/2z3uomu for a survey.) 

In 2017, Justin Flatt, a biologist then at the 
University of Zurich in Switzerland, called for 
more clarity around scientists’ self-citation 
records. Flatt, who is now at the University of
Helsinki, suggested publishing a self-citation 
index, or s-index, along the lines of the h-index 
productivity indicator used by many research- 
ers. An h-index of 20 indicates that a researcher 
has published 20 papers with at least 20 cita- 
tions; likewise, an s-index of 10 would mean a 
researcher had published 10 papers that had 
each received at least 10 self-citations. 
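
The two definitions above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `h_type_index` and the per-paper counts are invented for the example, and are not taken from Flatt's proposal or from the PLoS Biology data set. It also shows one way to compute an h-index after subtracting self-citations, and an overall self-citation rate.

```python
from typing import Sequence


def h_type_index(counts: Sequence[int]) -> int:
    """Return the largest k such that at least k papers have a count >= k.

    Applied to total citations per paper this gives the h-index;
    applied to self-citations per paper it gives the s-index.
    """
    ranked = sorted(counts, reverse=True)
    index = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            index = rank
        else:
            break
    return index


# Hypothetical author (made-up numbers): citations received by each paper,
# and how many of those citations are self-citations.
total_citations = [25, 18, 12, 9, 7, 4, 2]
self_citations = [10, 9, 3, 3, 1, 0, 0]

h_index = h_type_index(total_citations)
s_index = h_type_index(self_citations)

# h-index recomputed after removing self-citations from each paper's count.
external_citations = [t - s for t, s in zip(total_citations, self_citations)]
h_index_external = h_type_index(external_citations)

# Share of all citations that are self-citations.
self_citation_rate = sum(self_citations) / sum(total_citations)

print(h_index, s_index, h_index_external, round(self_citation_rate, 3))
```

For this made-up record the h-index does not change when self-citations are removed, even though about a third of the citations are self-citations, which is one reason a single corrected score may say little without the context the article describes.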

Flatt, who has received a grant to collate data
for the s-index, agrees with Ioannidis that the
focus of this kind of work shouldn't be about
establishing thresholds for acceptable scores,
or shaming high self-citers. “It's never been
about criminalizing self-citations,” he says.
But as long as academics continue to promote
themselves using the h-index, there’s a case for
including the s-index for context, he argues.


CONTEXT MATTERS

An unusual feature of Ioannidis’s study is its
wide definition of self-citation, which includes
citations by co-authors. This is intended to
catch possible instances of citation farming;
however, it does inflate self-citation scores,
says Marco Seeber, a sociologist at Ghent
University in Belgium. Particle physics and
astronomy, for example, often have papers with
hundreds or even thousands of co-authors, and
that raises the self-citation average across the
field.

Ioannidis says that it’s possible to account
for some systematic differences by comparing
researchers with the average for their country,
career stage and discipline. But more generally,
he says, the list is drawing attention to cases
that deserve a closer look. In unpublished
work, Elsevier's Baas says that he has applied
a similar analysis to a much larger data set of
7 million scientists: that is, all authors listed
in Scopus who have published more than
5 papers. In this data set, Baas says, the median
self-citation rate is 15.5%, but as many as 7% of
authors have rates above 40%. This proportion
is much higher than among the top-cited
scientists, because many of the 7 million
researchers have only a few citations overall
or are at the start of their careers. Early-career
scientists tend to have higher self-citation rates
because their papers haven't had time to amass
many citations from others.

According to Baas’s data, Russia and Ukraine
stand out as having high median self-citation
rates (see ‘Country by country’). His analysis
also shows that some fields stick out — such
as nuclear and particle physics, and
astronomy and astrophysics — owing
to their many multi-authored papers
(see ‘Physics envy?’). Baas says he has
no plans to publish his data set, however.

Because particle physics and astrophysics have big consortia that
publish multi-authored papers which cite each other, they have the

set, divided into 176 fields.
+Co-author self-citation: self-citations to a paper by any co-author are counted as
self-citations in each co-author’s record.


NOT GOOD FOR SCIENCE?

Although the PLoS Biology study
identifies some extreme self-citers
and suggests ways to look for others,
some researchers say they aren't con-
vinced that the self-citation data set
will be helpful, in part because this
metric varies so much by research dis-
cipline and career stage. “Self-citation
is much more complex than it seems,”
says Vincent Lariviere, an information
scientist at the University of Montreal
in Canada.

Srivastava adds that the best way 
to tackle excessive self-citing — and 
other gaming of citation-based indi- 
cators — isn't necessarily to publish 
ever-more-detailed metrics to compare 
researchers against each other. These 
might have their own flaws, he says, and 
such an approach risks sucking scien- 
tists even further into a world of evaluation by 
individual-level metrics, the very problem that 
incentivizes gaming in the first place. 

“We should ask editors and reviewers to 
look out for unjustified self-citations,” says 
Srivastava. “And maybe some of these rough 
metrics have utility as a flag of where to look 
more closely. But, ultimately, the solution 
needs to be to realign professional evalua- 
tion with expert peer judgement, not to dou- 
ble down on metrics.” Cassidy Sugimoto, an 
information scientist at Indiana University 
Bloomington, agrees that more metrics might 
not be the answer: “Ranking scientists is not 
good for science.” 

Ioannidis, however, says his work is needed. 
“People already rely heavily on individual- 
level metrics anyhow. The question is how to 
make sure that the information is as accurate 
and as carefully, systematically compiled as 
possible,” he says. “Citation metrics cannot 
and should not disappear. We should make the 
best use of them, fully acknowledging their 
many limitations.”


Richard Van Noorden is a features editor 
with Nature in London. Dalmeet Singh 
Chawla is a freelance science journalist in 
London. 
Use ancient remains
more wisely 


Researchers rushing to apply powerful sequencing techniques to ancient-human 
remains must think harder about safeguarding, urge Keolu Fox and John Hawks. 


The study of ancient-human
populations and our now-extinct
close relatives has thrived over the
past decade, as genetic material is exam-
ined with cheaper and more sophisticated
sequencing technologies. Only nine years
ago, the partial sequencing of a Neanderthal
genome was a major scientific achievement.


Today, researchers are pursuing what many 
have termed a factory-like approach to ana-
lysing ancient DNA, with the processing of
hundreds of samples. 

As a result, we have a much better 
understanding of (among other things) 
which human populations interbred with 
Neanderthals, and which didn’t; how
people dispersed across Europe during
the Bronze Age; and how pastoralism
developed in Africa.

But such progress comes at a price. 

Extracting the best-quality DNA from 
ancient remains requires the partial 
destruction of those specimens. And once 
bones, teeth, hair and so on are ground
into dust, future opportunities for using
them to understand our past are lost. 

We recognize the enormous potential 
of ancient DNA to help reveal human 
history. In fact, as long as interested par- 
ties give their consent, we are hoping to 
apply genomics to the remains of Hawai- 
ian men and women who lived hundreds 
to thousands of years ago. (Our aim is to
understand how the introduction of leprosy,
smallpox, syphilis and other diseases
from European colonialists in the
eighteenth century has shaped the genomes of
Native Hawaiians today.) We also recognize
that some leading labs are taking steps to
reduce the destructiveness of sampling,
for instance by developing techniques that
allow ancient-DNA sequences and radiocarbon
dates to be obtained from the same
sample instead of from multiple ones.

Yet we are becoming increasingly 
concerned. To our knowledge, no one cur- 
rently has a full list of all the samples from 
ancient humans and closely related species 
examined so far (meaning samples rang- 
ing from hundreds to tens of thousands of 
years old). No one is tracking the success 
rate of data recovery across laboratories 
and samples. And no one knows how many 
specimens are left. 

With such a rapid scale-up in analytical
capacity, the diverse stakeholders involved 
(archaeologists, molecular biologists and 
bioinformaticians; editors and journalists; 
museum curators; and the descendants of 
the populations being studied) must talk. 
They need to establish how to balance 
discovery now with the need to safeguard 
cultural remains in the long term. 

Unless some ground rules are estab- 
lished, future scientists, armed with bet- 
ter, potentially less-invasive methods for 
extracting DNA from ancient samples,
could well look back on this era as a time 
of heedless destruction, fuelled by the 
relentless pressure to publish — or what 
one anthropologist has described as an 
“impetuous anxiety for discovery”.


HOW BAD IS IT? 
Over the past ten years, there have been 
tremendous successes in education and 
engagement efforts that aim to bring a 
broader range of people (including those 
with interests and responsibilities as 
descendants of particular ancient com- 
munities) into consultations about genetic 
research. For instance, since 2011, a grow- 
ing consortium of genomicists, now in 
North America, Hawaii, Finland, New 
Zealand and Australia, have helped to guide 
summer training programmes for Indige- 
nous people. These educate students about 
the potential uses and misuses of genomics, 
including ancient genomics, as well as how 
to sequence DNA. 

Yet irrevocable decisions continue to
be made about the sampling of ancient
specimens, guided by the immediate
research interests of a few.

As an example, many researchers focus 
their sampling effort on the petrous bone, 
the hard portion of the temporal bone at 
the base of the skull, which houses the 
intricate structures of the inner ear. This 
dense bone contains a high concentration 
of endogenous DNA. 

Last year, a team looking at the morphology
of the inner ear noted that researchers
were breaking open bony labyrinths and
drilling into hundreds of petrous bones for
DNA without first taking photographs, or
using scanning techniques such as
micro-computed tomography (microCT) to make
morphological records.

Petrous bone could contain uniquely
high concentrations of other [...]
structures of the inner ear, including the
semicircular canals and cochlea, intact
bone could reveal insights about an
individual’s balance or hearing.

Some laboratories have used microCT 
scanning, both to preserve data from 
petrous bone, and to guide their drilling 
to minimize destruction of the specimen.
Unfortunately, such methods have not been 
adopted as a standard, partly because indi- 
vidual groups tend to focus on their own 
research agenda rather than on the bigger 
picture. 

Destruction of fragments of ancient
bones or teeth is key to many techniques
used in palaeoanthropology, including
ancient proteomics, radiocarbon analysis,
electron-spin resonance dating, stable-isotope
sampling, dental-calculus sampling
to assess what food people ate, and the
sectioning of teeth for studies of growth. But
so far, investigators and commentators have
begun to routinely apply the terms ‘DNA
factory’ or ‘industrial-scale’ only to ancient
genomics (whether in publications, at
conferences or on social media).

BONE BONANZA
The number of ancient samples used in DNA
analyses has soared in recent years.

Most of these other techniques are 
applied to tens of samples in any one 
study, occasionally to a single sample. 
Ancient genomics stands apart because the 
decreased cost of sequencing and the rapid 
acceleration of technologies have enabled 
some laboratories to pursue projects involv- 
ing hundreds of samples. The publication 
of such large-scale studies has put pres- 
sure on others to use similarly impressive 
sample sizes. What’s more, analysing the 
movement and evolution of ancient popu- 
lations requires researchers to compare the 
genome of any one sample with those of as 
many of the individual's ancient contempo- 
raries as possible. Thus, studies involving 
bigger sample sizes provide more refer- 
ence data for other investigators to draw 
on, creating a feedback loop. 


RETHINK PERSPECTIVE 
In our view, two changes need to be 
implemented in ancient genomics research. 


Give diverse stakeholders a say. Cur- 
rently, a patchwork of regulations and 
institutions determines whether destruc- 
tive research on ancient human remains 
can proceed. In some jurisdictions, Indig- 
enous communities are formally involved 
in decision-making for research that 
involves the bones of their ancestors. In 
others, the decision could rest in the hands 
of a single curator. 

But on its current trajectory (see ‘Bone 
bonanza’), genomic research on ancient- 
human populations, or on close extinct 
relatives, could hit a ceiling within decades 
because of the scarcity of ancient remains. 
It is therefore urgent that, rather than 
sequencing an ancient genome in the hope 
that something interesting will emerge, 
researchers state up front what question 
they are seeking to answer — and that 
people with diverse perspectives evaluate 
their goals. Because human remains have 
intrinsic value and a role in the beliefs and 
cultures of many peoples of the world, as 
well as scientific value, decisions about 
whether or how to use them for research 
should be governed by a broad group, from 
researchers to the descendants of the popu- 
lations being studied. For instance, if only 
three samples of a given ancient human 
population exist in the world, how many is 
it reasonable to destroy to answer a specific 
question about human migration? 
The petrous part of the temporal bone is used for radiocarbon dating.


This ‘question-led’ approach would
enable people to consider the trade-off
between collecting ancient DNA data
today and waiting for future sequencing
methods, which could potentially yield
more information less expensively and
less destructively. (Sequencing DNA from
ancient samples was much more hit and
miss before the emergence in the mid- to
late 2000s of targeted-capture next-generation
sequencing, which enables researchers
to separate endogenous from contaminant
DNA, and then amplify it.) Also, greater
engagement from more diverse stakeholders
on how to handle scarce ancient
remains as new technologies emerge will
inspire conversations that bridge disciplines,
lead to more accurate models and
hypotheses and help form lasting partnerships.
In our view, such an approach is crucial
for fostering trust in a field in which,
historically, the decisions of archaeologists
and geneticists have led to deep distrust in
many communities.


Create accountability. Just as timber and 
minerals are meticulously tracked at truck 
weighing stations and other venues to dis- 
courage the illegal acquisition of resources, 
curators, researchers and others must 
openly document the passage of ancient 
remains from one institution to another 
— and everything that happens to those 
remains along the way. With such a record,
all ancient remains would be audited and
people would know which specimens were
ground into dust, but did not generate
useful data, and which efforts generated
data but did not result in a publication,
and so on.

In the United States, the National Science 
Foundation (NSF) could take the lead on 
establishing such a database. Or grass-roots 
initiatives at museums, such as the Smithso- 
nian Museum of Natural History in Wash- 
ington DC or the Bernice Pauahi Bishop 
Museum in Honolulu, Hawaii, could help 
to shift practice. Buy-in from the research 
community could easily be obtained if ref- 
erees and grantors required declaration 
of all sampling information. Importantly, 
such a decentralized approach would help 
to ensure that knowledge about ancient 
samples is not limited to a few groups.


WASTED RESOURCES 
Many of the great archaeological sites of 
prehistory are now empty thanks to early 
archaeologists — sometimes little more 
than treasure-hunters — commanding 
armies of unskilled workers to scoop up 
the contents of caves, tombs and burial 
grounds. When so little was known, the 
bar was low; any discovery was interesting, 
and little or nothing was left for future gen- 
erations. In fact, even as late as the 1990s, 
large sections of ancient human skeletons 
were destroyed for radiocarbon and other 
analyses that can now be accomplished 
using much smaller portions of bone. 
Rather than repeat the mistakes of the 
past, future generations of scientists — from 
all countries of the world and from all sectors 
of society — must be given the opportunity 
to interpret our shared history.


Keolu Fox is an assistant professor of 
biological anthropology at the University of 
California, San Diego. John Hawks is the 
Vilas-Borghesi Distinguished Achievement 
Professor of Anthropology at the University 
of Wisconsin–Madison.

e-mails: pkfox@ucsd.edu; jhawks@wisc.edu 
John Cade, pictured in 1974, was the first person to test lithium as a treatment for bipolar disorder.


PHARMACOLOGY 


The serendipitous 
story of lithium


Douwe Draaisma praises a gripping history of 
psychiatry’s most consistently effective medicine. 


Some 70 years ago, John Cade, an
Australian psychiatrist, discovered a
medication for bipolar disorder that
helped many patients to regain stability
swiftly. Lithium is now the standard treatment
for the condition, and one of the most
consistently effective medicines in psychiatry.
But its rise was riddled with obstacles. The
intertwined story of Cade and his momentous
finding is told in Lithium, a compelling book
by US psychiatrist Walter Brown.

Bipolar disorder, labelled manic-depressive
illness until 1980, affects around 1 in 100
people globally. Without treatment, it can
become a relentless cycle of emotional highs
and lows. Suicide rates for untreated people
are 10-20 times those in the general popu- 
lation. Fortunately, lithium carbonate — 
derived from the light, silvery metal lithium 
— can reduce that figure tenfold. 

Brown’s telling of Cade’s eventful life 
covers much of the same ground as Finding 
Sanity (2016), a rather hagiographic bio- 
graphy by Greg de Moore and Ann West- 
more. What Brown does superbly well is to 
show that Cade made his discovery without 
access to advances in technology or to mod- 
ern facilities — and almost despite them. His 
finding was the happy result of being forced 
to work with the simplest of means. 
During the Second World War, Cade was
interned for more than three years in the
notorious Japanese prisoner-of-war camp at 
Changi in Singapore. He was put in charge 
of the psychiatric section, where he began 
to note the decisive link between certain 
food deficiencies and diseases in his fellow 
prisoners. A lack of B vitamins, for instance, 
caused beriberi and pellagra. 

After the war, he pursued his investiga- 
tions. Working from an abandoned pantry 
in Bundoora Repatriation Mental Hospital 
near Melbourne, Australia, he began to collect 
urine samples from people with depression, 
mania and schizophrenia, aiming to discover 
whether some secretion in their urine could 
be correlated to their symptoms. With no 
access to sophisticated chemical analysis and 
largely unguided by theory, Cade injected the 
urine into the abdominal cavities of guinea 
pigs, raising the dose until they died. The 
urine of people with mania proved especially 
lethal to the animals. 

In further experiments at Bundoora, Cade 
found that lithium carbonate — which had 
been used to treat conditions such as gout 
since the nineteenth century — reduced the 
toxicity of patients’ urine. Cade also noticed 
that a large dose of the medication tended to 
calm the guinea pigs. He could turn them on 
their backs, and the normally restive rodents 
would gaze placidly back at him. He won- 
dered whether lithium could have the same 
tranquillizing effect on his patients. After 
trying it out on himself to establish a safe 
dose, Cade began treating ten people with 
mania. In September 1949, he reported fast 
and dramatic improvements in all of them in 
the Medical Journal of Australia (J. F. J. Cade
Med. J. Aus. 2, 349-351; 1949). The major- 
ity of these patients had been in and out of 
Bundoora for years; now, five had improved 
enough to return to their homes and families. 

Cade’s paper went largely unnoticed at 
the time. Soon, moving along the rows of 
the periodic table like a beachcomber on a 
shore, Cade began to experiment with salts 
of rubidium, cerium and strontium. None 
proved therapeutic. In 1950, he also aban- 
doned his experiments with lithium. The 
therapeutic dose of lithium is dangerously 
close to a toxic dose, and that year one of 
his patients, ‘W.B.’,
a man with a 30-year
history of bipolar disorder,
appeared in
the coroner’s records 
as having died from 
lithium poisoning. 

Lithium: A Doctor, a Drug, and a Breakthrough
WALTER A. BROWN
Liveright (2019)

Brown also weaves in the story of Mogens
Schou. The Danish psychiatrist was as
much a hero as Cade, fighting long and hard
to get lithium accepted as a treatment for
bipolar disorder. He knew the condition
intimately, because his brother had it. Starting in
the 1950s, Schou teamed up with fellow psy- 
chiatrist Poul Baastrup to conduct a series 
of lithium experiments with ever stricter 
conditions, culminating in a double-blind, 
placebo-controlled clinical trial. Published in 
1970 in The Lancet, this established beyond 
doubt that lithium was effective for most peo- 
ple with bipolar disorder, including Schou’s 
brother (P. C. Baastrup et al. Lancet 296, 
326-330; 1970). 

Today, lithium helps to stabilize the moods
of millions of people with the condition,
although the dose must be carefully controlled
and it can have unpleasant side effects,
such as nausea and trembling. Its mechanism
is still something of a mystery. Most
research targets the delicate chemistry
supporting the functioning of
neurotransmitters; but as yet, definitive results
are lacking. Nor has
the cause of the disorder been established. It is
clear that there is a genetic component: if one
of a pair of monozygotic twins (who share all
their genetic material) has the disorder, there
is around a 60% chance that the other will
have it. In dizygotic twins, the figure is 10%.

Finishing Lithium, readers are left with a
sense of paradox. The drug that set off the
‘psychopharmacological revolution’ of the
1950s, with antipsychotics and antidepressants
arriving in its wake, is in many ways
a stunning success. Yet it was developed in
a ramshackle pantry, and the bottled urine
samples were stored in the Cade family
refrigerator. Moreover, in retrospect, the discovery
of lithium seems in part related to an erroneous
interpretation on Cade’s part. The ‘tranquillized’
guinea pigs were probably showing
the first symptoms of lithium poisoning: lethargy
is still a warning sign of an overdose. And
the step from guinea pigs to humans was a
“conceptual leap”, as Brown kindly puts it,
hardly a deduction from sound theory. It is
unlikely that a modern researcher would get
permission for experiments such as Cade’s.

Cade’s findings could easily have foundered
if Schou and others, such as US medical
researcher John Talbott, hadn’t followed up on
his 1949 paper. Thus, hailing Cade as a
trailblazer is valid, but without Schou and the
rest, there would be no trail. Thanks to them
all, this ubiquitous element, easily manufactured
and never patented by pharmaceutical
companies, remains both cheap and invaluable
as a treatment for a troubling disorder.


Douwe Draaisma is professor of the
history of psychology at the University of
Groningen in the Netherlands, and author of
Disturbances of the Mind.
Books in brief


How to Be a Dictator 

Frank Dikötter BLOOMSBURY (2019)

For this magisterial study on the misuse of power, historian Frank
Dikötter analysed the strategies of eight brutal twentieth-century
dictators. The result reveals how weak, largely unelectable men such
as Adolf Hitler and Joseph Stalin maintained cults of personality
through tireless self-glorification, aided by propaganda and the
illusion of popular consent. Dikötter’s insights into their modus
operandi — “to sow confusion, to destroy common sense, to enforce 
obedience, to isolate individuals and crush their dignity” — make for 
salutary reading at a time of persistent attacks on democracy. 


End Times 

Bryan Walsh HACHETTE (2019) 

In this sweeping “brief guide to the end of the world”, journalist Bryan 
Walsh details the science on existential risks, from supervolcanoes
to global war — many of them amplified by chaotic governance. He 
explores United Nations climate conferences, synthetic-biology labs 
and the US nuclear command-and-control system. He disentangles 
the maths of asteroid strikes and the complexities of gene editing. 
And, as billionaires focus on escape (boltholes in New Zealand, space 
colonization), Walsh envisions survival for the rest of us — a scenario 
of subterranean refugees subsisting on insects, fungi and rats. 


Don’t Believe a Word 

David Shariatmadari WEIDENFELD & NICOLSON (2019) 

Language, notes writer David Shariatmadari, is a hall of mirrors:
we can understand it only through language itself. His assured
tour takes in the origins of language (he argues for nurture over 
nature) and deconstructs a plethora of myths. These include the 
supposed demise of linguistic standards, the question of animal 
communication, the vagaries of translation and the comparative 
richness of vocabularies. Insights abound, from the blurred 
boundary between Hindi and Urdu, to Australian languages in which 
the grammar changes when the speaker’s mother-in-law is present. 
Proof!

Amir Alexander FARRAR, STRAUS AND GIROUX (2019) 

In his opus Elements, fourth-century BC Greek mathematician Euclid 
created a “complete world of mathematical truths”. Yet, as historian 
Amir Alexander’s subtle chronicle shows, Euclid’s ideas really 
blossomed only in the Renaissance. Then, luminaries such as Leon 
Battista Alberti codified what they saw as the hidden geometries of the 
Universe, including the rules of perspective. The geometric imperative 
went on to shape the French monarchy’s rigidly hierarchical world 
view, symbolized by the formal gardens of Versailles, before emerging 
in the architecture of power from New Delhi to Washington DC. 
The Curious World of Seaweed

Josie Iselin HEYDAY (2019) 

From the silken greens of Ulva fenestrata to the bulbous glories of 
Botryocladia pseudodichotoma, seaweeds are stars of the intertidal 
zone. This paean by Josie Iselin, a fine-art photographer and
writer, celebrates both their remarkable morphology and tactility
(“smooth and slimey and tough and stretchy”), and the history of 
phycology. Iselin studs her evocative text with exquisite ‘portraits’ 
of algal species — a mix of archival illustrations, snaps of historical 
specimens and luminous shots taken using a flatbed scanner. A 
mesmerizing swim through a liminal world. Barbara Kiser 
Funding open-access
papers after 2024 


Several publishers are 
concerned about the timeline 
for implementing Plan S, the 
European initiative that will 
make all research papers free to 
access (see Nature 561, 17-18; 
2018). Their main concern is 
whether their markets will be 
ready for a ‘pay to publish’ model 
by 2024, when funders’ support 
for transformative agreements 
ends. As co-chairs of the 
implementation task force of the 
international research-funder 
consortium cOAlition S (see 
www.coalition-s.org), we wish to 
clarify our position with regard 
to financially supporting the 
important transition to full open 
access after 2024. 

We recommend that open- 
access publication fees should be 
covered by funders or research 
institutions, not by individual 
researchers (see go.nature.com/33rdttn).
Our 2019
guidelines for implementing 
Plan S indicate how we, as 
funders, intend to help finance 
full and immediate open-access 
publication until 2024 (see Nature 
https://doi.org/gf3x2r; 2019). 

After 2024, we will be 
encouraging institutional 
libraries and large consortia to 
switch from ‘read and publish’ 
agreements with publishers to 
‘pure publish’ deals for portfolios 
of subscription journals that 
have become open-access 
journals. The cOAlition S 
funders will contribute to 
financing such deals, which will 
be more cost-effective and have 
fewer transaction costs than a 
single-paper charging system. 
The financial transaction would 
then no longer be between the 
author and the editor or journal, 
removing any concerns about 
perverse incentives for lax 
quality control. 

We look forward to 
working with publishers who 
demonstrate leadership in this 
important new era of research 
reporting. 

John-Arne Røttingen Research
Council of Norway, Oslo, Norway. 


David Sweeney Research 
England, UK Research and 
Innovation, London, UK. 
jro@forskningsradet.no 


Ditch mouse swim 
test for depression 


Having practised psychiatry 
for 24 years, I was pleased to 
see that the value of the mouse 
‘forced-swim test’ is being called 
into question by researchers 
studying human depression 
(Nature 571, 456-457; 2019). 
Besides being shockingly cruel, 
this behavioural test misses the 
mark in approximating clinical 
depression in people. 

Physical and emotional 
abuse (such as that associated 
with the test) is likely to induce 
hopelessness in humans and 
animals alike. In my experience, 
however, hopelessness is 
just one symptom of clinical 
depression in humans; abused 
people do not always meet the 
full criteria for major depressive 
disorder; and most individuals 
with the disorder are not 
currently being abused. 

In my view, the complexity of 
human-brain function means 
that interpretations based on 
simplistic animal-behavioural 
testing are questionable. Data 
from clinical studies and from 
technologies that use human 
induced pluripotent stem cells 
offer a more rational approach 
for research into mental health. 
Jaymie Shanker Shaker Heights, 
Ohio, USA. 
jaymieshanker@gmail.com 


Enable accreditation 
of scientific software 


We would see improvements 
in the long-term accuracy and 
reliability of academic open- 
source software if journals 
required submitted software 
to be accredited, and if funders 
were to establish a mechanism 
for accrediting it (see Nature 571, 
133-134; 2019). 

Funding bodies could improve 
the quality and reproducibility of 
scientific software by creating a
software-engineering task force 
that would cover code reviews, 
training workshops and standards 
development, for example. 

A software-standards 
accreditation scheme from large 
funding organizations would 
carry considerable clout and help 
to usher in cultural change. The 
scheme would ensure minimum 
standards in reproducibility, 
documentation and security. 
Different aspects such as code 
coverage (the proportion of 
code that is automatically 
tested) could be evaluated using 
automated metrics and tests. 
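
As a rough illustration of the kind of automated metric described above, the sketch below computes line coverage as the share of executable lines that a test run actually executed, and compares it with a minimum standard. The function names, the input format (sets of line numbers) and the 80% threshold are hypothetical choices made for this example; the letter does not specify any of them, and a real accreditation scheme would probably rely on an established coverage tool.

```python
from __future__ import annotations


def coverage_percent(executable_lines: set[int], executed_lines: set[int]) -> float:
    """Return the percentage of executable lines that the tests executed."""
    if not executable_lines:
        # Nothing to test: treat an empty file as fully covered.
        return 100.0
    covered = executable_lines & executed_lines
    return 100.0 * len(covered) / len(executable_lines)


def meets_minimum(percent: float, minimum: float = 80.0) -> bool:
    """Check coverage against a minimum standard (80% is a hypothetical value)."""
    return percent >= minimum


# Hypothetical example: a file with ten executable lines, five of which ran.
# Line 13 is not executable, so it is ignored.
example = coverage_percent(set(range(1, 11)), {1, 2, 3, 5, 8, 13})
print(example, meets_minimum(example))
```

Coverage measures only how much code is exercised, not whether the tests check the right behaviour, which may be why the letter lists it as one aspect among several.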

Public parts of code would 
be subject to automated 
vulnerability testing for common 
security issues. They would also 
need to have basic application- 
programming-interface 
documentation, which describes 
how programmers can use each 
software function and how 
other code can interface with it. 
Alexander L. R. Lubbock 
Vanderbilt University, Nashville, 
Tennessee, USA. 
alex.lubbock@vanderbilt.edu 


Keep a close eye 
on the tiger 


The good news that India’s 
wild tiger numbers have been 
increasing by 6% annually since 
2006 is offset by reported declines 
in their habitat (see go.nature.com/2tig959).
Habitat loss is
a particular concern for the 
genetically unique populations 
in the northeast of the country. 
Conservation efforts must now 
focus on protecting those areas 
and improving the connectivity 
of the habitat corridors that are 
crucial for the animals’ dispersal. 
Tiger surveys, produced in 
conjunction with the Wildlife 
Institute of India, are run 
every four years by the Indian 
government. The 2018 survey 
was unprecedented in intensity 
and scale, with 77,000 tiger 
photographs taken from motion- 
triggered camera pairs placed 
in 27,000 locations. Together 
with some 35 million photos, it 
identified more than 80% of the
country’s 3,000 tigers. 

Surveys on this scale entail 
sifting through tens of millions 
of wildlife photos, of which 
only a tiny fraction are of 
tigers. Research teams in India 
and elsewhere are developing 
artificial-intelligence tools to 
automate the process. This will 
improve conservation efforts 
worldwide by teaching us more 
about the effects of human 
pressures on the abundance and 
distribution of wildlife. 

Chris Carbone Zoological 
Society of London, UK. 

Matt Hayward University of 
Newcastle, Australia. 

Joseph Bump University of 
Minnesota, USA. 
chris.carbone@ioz.ac.uk 

C.C., M.H. and J.B. declare 
competing interests; see
go.nature.com/2z6tbiv.


Testing the impacts 
of sea-bed mining 


Our DISCOL experiment of 1989 
was intended to explore some 
of the environmental impacts 
of sea-bed mining (see Nature 
571, 465-468; 2019). It did not 
‘simulate’ industrial mining of the 
deep sea as you imply, because 
it did not cause the type and 
extent of sea-floor disruption and 
habitat destruction that would 
be associated with commercial 
extraction processes. We 
simply provoked a mechanical 
disturbance of the sea floor and 
studied the recolonization and 
restoration of the disturbed area 
over a seven-year period. 

Until industry has developed 
a test system for extracting 
metalliferous nodules from the 
sea floor, it will not be possible 
to simulate the actual impacts of 
mining or to monitor its effects 
on sediments and communities. 
It will then take time to do the 
environmental investigations 
and evaluations that are required 
before commercial mining can 
proceed. 
Hjalmar Thiel, Gerd Schriever 
Hamburg, Germany. 
hjalmar.thiel@hamburg.de 
Nanotube computer scaled up


Electronic devices that are based on carbon nanotubes have the potential to be more energy efficient than their silicon 
counterparts, but have been restricted in functionality. This limitation has now been overcome. SEE ARTICLE P.595 


FRANZ KREUPL 


For many decades, progress in electronics
has been driven by a gradual reduction in
the size of silicon transistors (electronic
switches). However, this scaling is becom- 
ing increasingly difficult and is now yielding 
diminishing returns. Transistors based on 
semiconducting carbon nanotubes are clear 
front runners as replacements for silicon tran- 
sistors in advanced microelectronic devices. 
But imperfections inherent in carbon nano- 
tubes, and challenges in handling these tiny 
objects, have prevented their use in real-world 
microelectronic applications. On page 595, 
Hills et al. report a major advance in this field:
a 16-bit computer that is built entirely from 
carbon-nanotube transistors. 

To achieve this milestone, the authors 
needed to develop a viable nanotube- 
transistor technology that provides two kinds 
of transistor: p-type metal-oxide-semi- 
conductor (PMOS) and n-type metal-oxide- 
semiconductor (NMOS). In digital electronics, 
a computation is divided into a sequence of 
elementary (logic) operations that are carried 
out by components called logic circuits. 
The present design of these circuits in the 
electronics industry is based on complemen- 
tary metal-oxide-semiconductor (CMOS) 
technology, which requires both PMOS and 
NMOS transistors. 

A PMOS (or NMOS) transistor is switched 
on when a negative (or positive) voltage is 
applied to an electrode known as the gate. 
This electrode controls the conductivity of 
the channel (in this case, formed by carbon 
nanotubes) between two other electrodes 
(the source and the drain). When a PMOS 
transistor and an NMOS transistor are inter- 
connected in series, the result is an element 
called an inverter (Fig. 1). If a low voltage is 
applied to such an inverter, the output voltage 
will be high, and vice versa. This element is the 
basic ingredient of all the logic circuits used in 
Hills and colleagues’ computer. 
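
The switching behaviour described above can be sketched as a small logic-level model. This is an idealized illustration, not a circuit simulation and not the authors' design flow: it assumes that voltages reduce to two levels (0 for low, 1 for high), that the PMOS transistor connects the output to the supply and that the NMOS transistor connects it to ground. All function names are invented for this example.

```python
def pmos_conducts(gate: int) -> bool:
    """PMOS switches on when its gate is low (negative relative to the supply)."""
    return gate == 0


def nmos_conducts(gate: int) -> bool:
    """NMOS switches on when its gate is high (positive relative to ground)."""
    return gate == 1


def inverter(v_in: int) -> int:
    """Series PMOS/NMOS pair: exactly one transistor conducts for a valid input."""
    if v_in not in (0, 1):
        raise ValueError("input must be 0 (low) or 1 (high)")
    if pmos_conducts(v_in):
        return 1  # output pulled up to the supply through the PMOS transistor
    return 0      # output pulled down to ground through the NMOS transistor


def truth_table() -> list[tuple[int, int]]:
    """List (input, output) pairs for both logic levels."""
    return [(v, inverter(v)) for v in (0, 1)]


print(truth_table())
```

In this simplified picture only one of the two transistors conducts for a given input, so the output is always tied to either the supply or ground.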

The authors made their transistors by
forming a network of randomly distributed,
high-purity (99.99%) semiconducting nanotubes
on a substrate. The formation process
resembles pouring a bowl of cooked spaghetti
onto a surface and then removing all the
strands that are not in direct contact with the
surface. The result is a substrate covered with
roughly a single layer of randomly oriented
nanotubes.

Figure 1 | A carbon-nanotube inverter. a, Hills et al. demonstrate a computer that uses basic elements
called inverters. Each of these inverters contains two kinds of transistor (electronic switch): a p-type
metal-oxide-semiconductor (PMOS) transistor and an n-type metal-oxide-semiconductor (NMOS)
transistor. These transistors are interconnected in series and are formed on a silicon oxide substrate. Each
transistor consists of three electrodes known as the source, the gate and the drain; the source and the
drain are separated by a channel that is formed of semiconducting carbon nanotubes. The micrometre-scale
width and length of a channel are indicated. b, If a low voltage is applied to the inverter, the output
voltage will be high, and vice versa.

Hills et al. then deposited metal on the 
nanotubes to connect them to the source and 
the drain. The work function of this metal (the 
energy needed to remove an electron from its 
surface) depended on whether the device was 
a PMOS or an NMOS transistor. The authors
covered the rest of each nanotube with care- 
fully selected and trimmed oxide materials, to 
isolate the nanotubes from their surroundings 
and to adjust their properties. In principle, the 
substrate does not need to be made of silicon; 
it simply needs to be flat. Moreover, the pro- 
cessing happens at relatively low temperatures 
(about 200-325 °C), so that stacking of further 
functional layers would easily be possible. 

Contemporary computer design is based 
on libraries of standard cells — sets of logic 
operations that can be interconnected for 
greater functionality. Hills and colleagues 
devised all the standard cells required to 
make their computer’s architecture using 
commercially available, conventional design 
tools. Because the semiconducting nanotubes 
had a purity of 99.99%, about 0.01% of them 
were metallic (non-semiconducting) and
could have jeopardized the circuits. However,
certain combinations of standard cells are more 
vulnerable to the presence of metallic nano- 
tubes than are others. The authors therefore 
enforced modified design rules that excluded 
such vulnerable combinations. Equipped with 
these tools, they were able to design, fabricate 
and test their computer by letting it execute 
‘Hello, World’ — a simple program that outputs 
the message “Hello, World” when run. 
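
The reason that a 0.01% metallic fraction matters can be sketched with a short calculation. The Python below is an illustration under stated assumptions, not the authors' analysis: it assumes that each nanotube is metallic independently with probability 0.0001, and the nanotube counts per transistor are hypothetical values chosen only to show the trend.

```python
def p_at_least_one_metallic(n_tubes: int, metallic_fraction: float = 1e-4) -> float:
    """Probability that at least one of n_tubes nanotubes is metallic.

    Assumes each nanotube is metallic independently of the others.
    """
    if n_tubes < 0:
        raise ValueError("n_tubes must be non-negative")
    return 1.0 - (1.0 - metallic_fraction) ** n_tubes


def expected_metallic(total_tubes: float, metallic_fraction: float = 1e-4) -> float:
    """Expected number of metallic nanotubes among total_tubes."""
    return total_tubes * metallic_fraction


if __name__ == "__main__":
    # Hypothetical nanotube counts per transistor channel.
    for n in (1, 10, 100, 1000):
        print(f"{n:5d} tubes: P(at least one metallic) = "
              f"{p_at_least_one_metallic(n):.4%}")
```

Even a small per-tube probability adds up across the many nanotubes in a full circuit, which is consistent with the authors' choice to exclude the most vulnerable cell combinations through design rules.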

Hills and colleagues’ nanotube computer 
is based on CMOS technology, runs 32-bit 
instructions on 16-bit data and has a 
transistor-channel length of roughly 1.5 micro- 
metres. It can therefore be compared to the 
silicon-based Intel 80386 processor, which 
was introduced in 1985 and had similar speci- 
fications. The early 80386 could process its 
instructions at a frequency of 16 megahertz 
(see go.nature.com/33clrla), whereas the 
nanotube computer has a maximum process- 
ing frequency of about 1 MHz. The reason for 
this difference lies in the capacitances (charge- 
storage abilities) of the electronic components, 
and in the amount of current that the smallest 
transistor can deliver. 

Digital logic simply involves charging 
and discharging the transistor gates and the
interconnects. The speed of charging and
discharging depends on the amount of current 
that a transistor can provide, which is related 
to the width and length of the transistor. A 
well-designed silicon transistor can deliver 
roughly one milliampere of current per micrometre
of width (1 mA µm⁻¹) (see go.nature.com/2z4wjda).
By contrast, the typical nanotube
transistors used by Hills et al. can provide
only about 6 µA µm⁻¹. This is the main feature
that will need improvement in future versions 
of the computer. 
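
The link between drive current and switching speed can be made concrete with the constant-current charging relation t = CV/I. The sketch below is a simplified model, not a circuit simulation: the load capacitance, voltage swing and transistor width are hypothetical values, and the nanotube figure is taken as about 6 microamperes per micrometre of width.

```python
def charge_time_s(capacitance_f: float, voltage_v: float,
                  current_per_um_a: float, width_um: float) -> float:
    """Time to charge a capacitance to voltage_v at constant current (t = C*V/I)."""
    drive_current_a = current_per_um_a * width_um
    return capacitance_f * voltage_v / drive_current_a


SILICON_A_PER_UM = 1e-3    # about 1 mA per micrometre of width (from the text)
NANOTUBE_A_PER_UM = 6e-6   # about 6 uA per micrometre (assumed reading of the text)

# Hypothetical load, chosen only for illustration.
LOAD_F = 1e-15    # 1 femtofarad
SWING_V = 1.0     # 1 volt
WIDTH_UM = 1.0    # transistor 1 micrometre wide

t_silicon = charge_time_s(LOAD_F, SWING_V, SILICON_A_PER_UM, WIDTH_UM)
t_nanotube = charge_time_s(LOAD_F, SWING_V, NANOTUBE_A_PER_UM, WIDTH_UM)
slowdown = t_nanotube / t_silicon

if __name__ == "__main__":
    print(f"silicon:  {t_silicon:.3e} s")
    print(f"nanotube: {t_nanotube:.3e} s")
    print(f"ratio:    {slowdown:.1f}")
```

In this model the ratio of charging times depends only on the ratio of the currents, because the same load is assumed for both devices. Real circuits also differ in their capacitances, so the ratio should not be read as a prediction of clock frequency.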

The first step for increasing the electric 
current is to reduce the transistor-channel 
length. It has already been demonstrated
that the channel lengths of nanotube transis- 
tors can be scaled down to 5 nm. The second 
step is to increase the density of nanotubes in 
each channel from as little as 10 nanotubes per 
micrometre to 500 nanotubes per micrometre.
For these networks of randomly distributed
nanotubes, there might be an upper limit on
the achievable density, but a deposition technique
has been shown to boost the current
in such networks to 1.7 mA µm⁻¹. The third
step is to decrease the width of the transistors,
and thereby the widths of the source and the
drain, which would allow these electrodes to
be charged and discharged more quickly.
These scaled-down transistors are essential
for nanotube-based CMOS technology that
operates at gigahertz frequencies.

Hills and colleagues’ achievement is based
on averaging the performances of several
nanotubes in each transistor channel. In the
large-scale nanotube computer of the distant
future, the PMOS and NMOS transistors will
contain only one nanotube. These nanotubes
will need to be semiconducting: no design
trick will provide a workaround if one of the
two nanotubes in an inverter is metallic.

The authors’ work is a great accomplishment
that touches on many research topics, from
materials science to processing technology,
and from circuit design to electrical testing.
However, more effort is required before the
team will need a sales department.

Franz Kreupl is in the Department of Hybrid
Electronic Systems, Technical University of
Munich, 80333 Munich, Germany.
e-mail: franz.kreupl@tum.de

TUMOUR BIOLOGY

Cells tagged near an
early spread of cancer

Cancer cells that travel to a distant site can prompt the normal neighbouring cells
at that location to create a tumour-promoting microenvironment. A tool that
identifies these normal cells offers a way to study this process. SEE ARTICLE P.603

MARIE-LIESSE ASSELIN-LABAT

Most types of cancer are lethal after
tumour cells have left their primary
site of growth and moved to colonize
a distant organ through a process termed
metastasis. Whether a cancer cell will metastasize
is determined not only by the cell itself,
but also by the microenvironment of that faraway
site called the metastatic niche. Only a
small number of the cells that reach such a new
location will successfully establish a presence
there and proliferate. The early processes that
aid cancer-cell growth at secondary locations
remain poorly understood, partly because
of a scarcity of suitable tools with which to
analyse these events. On page 603, Ombrato
et al. describe an innovative in vivo method
for identifying and isolating the rare normal
cells that are in close contact with cancer cells
that have just migrated to a secondary site. This
approach should help to clarify the early direct
interactions between metastatic cells and
neighbouring normal cells that help to shape
the formation of a metastatic niche.

Ombrato and colleagues engineered mouse
breast cancer cells to express a fluorescent protein
containing a region of amino-acid residues
that make it permeable to lipids (Fig. 1); this
feature enabled the protein to be released from
the cancer cell in a soluble form that could be
taken up by neighbouring cells. The authors
studied a model of metastasis in which mouse
breast cancer cells that expressed this protein,
plus a different fluorescent protein that could
be used to specifically monitor cancer cells,
were injected into the mouse tail vein and
subsequently colonized the lung.

Analysis of lung tissue revealed that healthy
cells located within a distance of five cell 
layers from cancer cells took up the protein, 
enabling the specific analysis of healthy cells in 
close contact with an emerging site of tumour 
growth. Ombrato et al. noted a direct corre- 
lation between the number of cancer cells in 
the lung and the number of neighbouring cells 
that took up the protein. These neighbouring 
cells included immune cells, which are known
to aid the colonization of the lung by breast 
cancer cells. 

Previous studies have used other techniques 
to identify cells in the vicinity of malignant 
tumours, by, for example, tagging the cells 
that specifically receive vesicles released from 
tumour cells. The advantage of Ombrato and
colleagues’ technique is that it offers a way to 
tag probably any type of cell present in the 
vicinity of a metastatic site. 

Figure 1 | A tool for identifying healthy cells in the vicinity of cancer cells. Ombrato et al. engineered
a fluorescent protein to contain amino-acid residues conferring lipid permeability, which enables the
protein to enter cells. The authors engineered mouse breast cancer cells to express this protein, and
injected the cells into the tail veins of mice. The cancer cells then colonized lung tissue at a site that is
termed a metastatic niche. The fluorescent protein released there from tumour cells was taken up by the
neighbouring healthy lung cells. The authors carried out direct in situ analysis, using approaches such
as microscopy, to assess these healthy cells of the metastatic niche. The lung tissue was then removed,
and the presence of the lipid-permeable fluorescent protein permitted the isolation and molecular
characterization of these cells. This information allowed the authors to carry out functional tests in vitro
to study how this type of healthy cell affects tumour growth.

The lipid-permeable fluorescent protein is
stable in recipient cells for only approximately
48 hours. Thus, the authors’ method allows 
an evaluation of the initial changes that occur 
at metastatic sites through time, but is not 
suitable for long-term tracking. 

Cancer cells can alter their local environ- 
ment to promote tumour growth through pro- 
cesses such as driving blood-vessel formation 
to increase nutrient supply, or causing changes 
that protect the tumour against immune 
attack. The rare cancer cells that successfully
thrive at a distant site usually alter the
microenvironment there to promote their
growth by, for example, starving normal cells
of metabolite molecules to increase nutrient
availability, or preparing a microenvironment
that promotes tumour growth. Ombrato
and colleagues used their tool to identify and 
isolate healthy cells for molecular analysis 
by methods that included RNA sequencing, 
to track changes that might promote the 
formation of the metastatic niche. 

The authors showed that normal lung cells 
(of a type called an epithelial cell) that sur- 
rounded invading breast cancer cells belonged 
to a cell lineage known as alveolar type 2 (AT2)
cells. Metastasizing cells benefited from this 
type of microenvironment, as demonstrated 
by Ombrato and colleagues’ observation that 
cancer cells grown with lung epithelial cells 
in vitro had a high proliferation rate. 

The AT2 cells that the authors identified in 
the vicinity of the invading cancer cells also 
had characteristics of a comparatively undifferentiated
sort of lung cell: a stem cell.
In the lung, most AT2 cells are fully differentiated,
with only a small subset behaving like
stem cells. Do these cancer cells prefer to
locate near lung stem cells, or do they drive 
the recruitment of such cells to their vicin- 
ity? Alternatively, might the cancer cells drive 
neighbouring differentiated AT2 cells to take 
on a stem-cell-like fate?

To investigate these possibilities, Ombrato 
and colleagues studied cancer cells grown 
in vitro with AT2 cells. This revealed that the 
presence of the cancer cells boosted the capa- 
city of AT2 cells to act as stem cells and to give 
rise to various types of differentiated lung cell, 
compared with AT2 cells grown in the absence 
of cancer cells. 

Future in vivo studies combining Ombrato 
and colleagues’ labelling approach with other
methods for tracing the lineage of lung stem 
cells will undoubtedly help to resolve how 
metastatic breast cancer cells create a micro- 
environment that nurtures tumour cells in the 
lung. The observation that breast cancer cells 
form a metastatic niche near lung stem cells is 
reminiscent of a previous observation: when 
prostate cancer cells metastasize to the bone, 
they settle near stem cells in the bone marrow, 
which helps to provide an environment that 
supports tumour growth.

Ombrato and colleagues’ method holds 
great promise for addressing why a given 
type of cancer cell preferentially migrates
to a particular initial secondary site, such
as the bone marrow or lung. This key ques- 
tion has not been fully answered. Using the 
authors’ technique to study breast cancer cell 
lines that have distinct organ preferences for 
their secondary sites should provide insight
about the mechanisms underlying such 
preferences. 

It will be important to determine whether 
the authors’ findings in mice are relevant for 
human cancer. In samples of human lung 
tissue containing metastatic breast cancer cells, 
Ombrato et al. found that lung epithelial cells 
neighbouring the tumour expressed a higher 
level of a protein associated with proliferation 
than did lung epithelial cells located farther 
away from the site of tumour invasion. Analy- 
ses to understand how this type of dividing cell 
supports breast cancer growth are essential 
areas for future studies. 

If migrating tumour cells could be prevented 
from lodging in distant organs, this would 
have a major positive clinical impact. Because 
cancer cells often have a high level of genomic 
alteration, focusing instead on their neigh- 
bouring cells, which are genetically more 
stable, might be an effective strategy for 
targeting a metastatic niche. The complex- 
ity of the microenvironment at such sites, in 
which components such as immune and non- 
immune cells affect the settlement of cancer 
cells, will need to be characterized in depth to 
test whether manipulation of such regions is 
a potential therapeutic strategy. Ombrato and
colleagues’ method provides a crucial way
forward for such endeavours.

An immune-cell
barrier protects joints


Inflammation and the repair of damaged tissues are regulated by immune cells 
called macrophages. The finding that they form a layer that shields mouse joints 
from damage has implications for the treatment of arthritis. SEE LETTER P.670 


CHRISTOPHER D. BUCKLEY 


Immune cells called macrophages
commonly function as scavenger-like
(phagocytic) cells that ingest and remove
damaged cells. Culemann et al. report on
page 670 that the macrophages present in
joints also fulfil an unexpectedly different role.

Macrophages derive from two main
cellular lineages. One lineage arises from
bone-marrow-derived immune cells called
monocytes. The other lineage is monocyte
independent, and is derived from cells that
disperse into the tissues during embryonic
development. The tissue-resident macrophages
in this lineage have distinctive
gene-expression profiles that depend on the
particular tissue in which they reside.

Rheumatoid arthritis is an immune- 
mediated disease associated with inflammation 
and the destruction of the cartilage and bone 
in joints, and macrophages have a key role 
in the initiation of this condition. However, 
little is known about the relative contribu- 
tion of the two lineages of macrophages to the 
development and function of joints in health 
and disease. To add to the complexity, macro- 
phages exist as various subsets, some of which 
are pro-inflammatory, whereas others are 
anti-inflammatory and aid tissue repair’. 

To study macrophages, the authors began by 
focusing on a protein called CX3CR1, which
is expressed on monocytes and macrophages.
The authors engineered CX3CR1-expressing
cells in mice to make a red fluorescent protein
so that the cells could be tracked in vivo. These 
cells were monitored in knee joints using an 
approach called 3D light-sheet fluorescence 
microscopy, and the joint tissue was treated 
using a technique that enabled the authors to 
obtain ‘optical clearance’, which improves the
visualization of internal structures.

Unexpectedly, the authors’ observations 
revealed that CX3CR1-expressing macro- 
phages exist as a layer of cells that forms a 
barrier, similar to a thin protective membrane, 
in the healthy joint (Fig. 1). This barrier forms 
as an outer layer of cells in the synovium, a 
region of the tissue that lines the joint. The 
barrier layer forms in a part of the synovium 
called the lining layer, and it physically sepa- 
rates the synovial fluid (which bathes the joint) 
from the sublining layers of the synovium. The 
CX3CR1-expressing barrier-forming macrophages
are found adjacent to a layer of cells
called fibroblasts in the lining layer. 

The authors carried out RNA sequencing, 
including single-cell sequencing, to profile 
the barrier macrophages. These cells express 
genes typically associated with barrier forma- 
tion in a type of non-immune cell called an 
epithelial cell. For example, the macrophage 
profile included genes that encode proteins 
associated with the formation of a structure 
called a tight junction that connects epithelial 
cells by forming a ‘seal’ between adjacent epi- 
thelial cells. This is surprising, because mac- 
rophages are usually thought of as having a 
signalling or scavenging role, rather than hav- 
ing a structural, barrier-like function. 

Using a mouse model of arthritis in which 
macrophages could be tracked by engineering 
them to be fluorescent, the authors observed 
that the barrier layer was highly dynamic. 
When arthritis was induced, the layer under- 
went active remodelling that loosened the 
physical interactions between barrier macro- 
phages and lining-layer fibroblasts. Like other 
types of tissue-resident macrophage, the 
barrier macrophages can ingest and remove 
inflammatory immune cells called neutrophils 
that accumulate and die in the synovial fluid 
in arthritis. 

When the authors induced arthritis in mice 
at the same time as they disrupted the barrier- 
forming layer of macrophages through genetic 
or pharmacological manipulation, arthritis 
was more severe than in animals in which the 
layer was intact. It would be interesting to test 
whether transferring barrier macrophages 
directly into mouse joints could suppress 
arthritis. 

To explore the origin of the barrier-forming, 
CX3CR1-expressing macrophages, the 
authors used intricate fate-mapping experi- 
ments, which revealed that these cells are not 
derived from monocytes. They also found that 
monocytes did not give rise to the other type 
of macrophage that resides in the joint, termed 
an interstitial synovial macrophage, which 
populates the sublining layer. The authors’
data are consistent with a model in which
interstitial macrophages give rise to barrier
macrophages.

Figure 1 | Barrier macrophages in the joint. Culemann et al. studied immune cells called macrophages
in mouse and human joints. Joints are surrounded by a tissue called the synovium, which is formed from
layers of cells called the lining and the sublining layers. The authors discovered that certain macrophages
form a cell layer that protects joints from the inflammatory immune-cell attacks on bone and cartilage
that are associated with arthritis. This barrier is formed in the lining layer, adjacent to a layer of cells
called fibroblasts. The barrier-forming macrophages express proteins associated with a type of barrier-forming
cell called an epithelial cell, and these proteins form structures called tight junctions that ‘seal’
cells together. Barrier-forming macrophages arise from a type of macrophage called an interstitial
macrophage, which resides in the sublining layer. By contrast, non-resident macrophages enter the joint
from blood vessels. These cells, which can drive inflammation, arise from immune cells called monocytes.

RNA-sequencing experiments revealed that 
interstitial macrophages can be divided into 
two groups. One group expresses the gene 
Retnla, whereas the other has a high level of 
expression of the genes that encode the pro- 
teins MHC class II and aquaporin. Cells of the 
latter group divide and differentiate to form 
either barrier macrophages, or interstitial 
macrophages that express Retnla. 

To analyse the macrophage subsets that 
arise as arthritis develops, compared with 
those present in an uninflamed joint, the 
authors carried out further single-cell RNA 
sequencing. As expected from previous work,
monocyte-derived macrophages that produce 
pro-inflammatory molecules accumulated in 
the arthritic joint. They are recruited into the 
joint from the bloodstream, exiting blood 
vessels to enter the sublining layer. During 
the influx of these pro-inflammatory macro- 
phages, the barrier macrophages maintained 
their anti-inflammatory role, expressing the 
proteins needed for them to remove dead 
neutrophils from the joint. 

When the authors compared their 
single-cell RNA data from mice with similar 
data sets available from an analysis of the
joints of people with rheumatoid arthritis, 
the gene-expression profiles of the macro- 
phage subsets matched up between the two 
species. This suggests that cells similar to 
the barrier and interstitial macrophages 
in mice might also exist in humans, and
thus be relevant to human disease.

The authors found that barrier macrophages 
were almost totally absent in synovial samples 
from people with active rheumatoid arthritis, 
whereas they made up 10% of the macro- 
phage population in samples from people who 
have osteoarthritis, a type of arthritis that is 
not associated with inflammation. It would 
be interesting to learn whether the popula- 
tion of barrier macrophages is restored in 
people whose rheumatoid arthritis is being 
successfully treated and is in remission. 

Culemann and colleagues’ work adds to 
studies showing that macrophages are
exquisitely adapted to the functions they 
perform in the tissues in which they reside. 
Barrier macrophages join a growing list of 
types of macrophage that shield tissues from 
damage caused by infection, inflammation 
or cancer. Tissue-resident macrophages can 
prevent neutrophil-mediated inflammatory 
damage by physically shielding damaged 
tissue from neutrophils. Furthermore, in
large body cavities, such as those surrounding
the gut, heart and lungs, specialized macrophages
have been described that are thought
to repair mechanical damage. These findings
also complement the discovery of distinct 
subsets of fibroblasts, located in the sublining 
or lining regions of the joint, which, respec- 
tively, drive either inflammation or bone 
damage in arthritis. The challenge that lies
ahead will be to develop ways of specifically 
targeting subsets of macrophages and fibro- 
blasts with the ultimate goal of developing new 
treatments for people with arthritis.

50 Years Ago


Medical geography could soon 
benefit considerably from computer 
graphics ... Medical geography is 
concerned with variations in the 
incidence of disease in different 
areas and the link with possible 
causes connected with elements
of the physical, biological and
sociocultural environment. As such
it is a topic in which maps should be
valuable, but they are often of little
use because of the time taken for 
such lengthy and repetitive processes 
as the calculation and statistical 
testing of attack rates, fatality rates, 
standardized mortality ratios and 
other disease indices. And it takes a 
long time to represent these indices 
in cartographic form. Computer 
graphics — the construction of maps 
and diagrams using the electronic 
computer — could have considerable 
potential in medical geography. They 
may, by the speed, efficiency and 
reliability of processing and mapping 
medical data, lead to a more effective 
use of maps. 

From Nature 30 August 1969 


100 Years Ago 


The Medical Research Committee 
has issued a report ... on the 
influence of alcohol on manual 
work and neuromuscular 
co-ordination. Accuracy and speed 
in typewriting and in using an 
adding machine, and accuracy in 
hitting spots on a target, were used 
as tests, and both pure alcohol and 
alcohol in the form of wine and 
spirit were employed. There was
no distinct difference between the
two forms of alcohol, and when 
very dilute (5 per cent.) the effect 
was about three-fourths as great as 
when taken strong (37-40 per cent.) 
for the same amount of alcohol ... 
The degree of effect depended 
largely on whether the alcohol was 
taken on an empty stomach or with 
food; on an average it was twice as 
toxic under the former condition. 
From Nature 28 August 1919

Superconductivity seen
in a nickel oxide 


Magnetism alone was thought to be responsible for superconductivity in copper 
oxides. The finding of superconductivity in a non-magnetic compound that is 
structurally similar to these copper oxides challenges this view. SEE LETTER P.624 


GEORGE A. SAWATZKY 


In 1986, scientists unexpectedly discovered
that a lanthanum barium copper oxide,
La₁.₈₅Ba₀.₁₅CuO₄, becomes a superconductor
(has zero electrical resistance) below
a relatively high temperature of 35 kelvin.
This result triggered one of the most intense
experimental and theoretical research efforts
in condensed-matter physics. Soon afterwards,
many other copper oxides (cuprates) were
found to superconduct at temperatures of up
to 133.5 K. However, after more than 30 years,
there is no consensus regarding the underlying
mechanism of cuprate superconductivity. On
page 624, Li et al. report that a neodymium
strontium nickel oxide, Nd₀.₈Sr₀.₂NiO₂, superconducts
below 9–15 K. This material has a
similar crystal structure to that of the cuprate
superconductors, suggesting that the authors’
discovery could lead to a better understanding
of superconductivity in these systems.

Superconductivity can occur in a metallic
material if the usual repulsive interaction
between electrons turns into an attractive
one. In this scenario, the response of surrounding
atoms to the charge and spin (magnetic
moment) of electrons indirectly leads to electron
pairing. At a low enough temperature,
these paired electrons condense to form a
superfluid (a state of matter that flows without
friction), which exhibits zero electrical
resistance. The key to understanding superconductivity
in a given material is to identify
the mechanism that provides the ‘pairing glue’.
In the conventional mechanism, the spatial
displacement of atoms close to an electron
forms an attractive region for another electron.
An analogy is that of two heavy balls on
a spring mattress, whereby the indentation in
the mattress made by one of the balls produces 
an attractive region for the other ball. How- 
ever, some theoretical work has suggested that 
this effect is too small to account for the high- 
temperature superconductivity of the cuprates. 

Researchers have therefore considered that 
the spins of moving electrons might cause 
deviations in the magnetic order (the ordered 
pattern of atomic spins) in the cuprates. With 
respect to the mattress analogy, these devia- 
tions represent mattress indentations, and 
the strong interactions between the spins of 
neighbouring Cu²⁺ ions represent the mattress
springs. To understand how this mechanism
works, consider the cuprate superconductor
La₁.₈₅Ba₀.₁₅CuO₄, which is obtained from
the compound La₂CuO₄ by replacing some
lanthanum atoms with barium. 

In La₂CuO₄, the electrons of a particular
Cu²⁺ ion are prevented from moving by their
strong repulsion from the electrons of surrounding
Cu²⁺ ions. As a result, the material is an
electrical insulator. Each Cu²⁺ ion has an odd
number of electrons and a net spin of 1/2. The
ions have strong antiferromagnetic order, 
which means that the spins of neighbouring 
ions point in opposite directions. 

When lanthanum in La₂CuO₄ is partially
replaced with barium, electron vacancies
called holes are introduced into the system
in a process known as doping. These holes
migrate to the planes of CuO₂ in the material.
If their density is low enough, they act as freely
moving charge carriers, resulting in metallic
behaviour. The combination of a Cu²⁺ ion and
a doped hole has an even number of electrons
and a net spin of 0, which causes a severe disturbance
in the spin directions of surrounding
Cu²⁺ ions. It is this change in the magnetic
background associated with hole doping that 
leads to pairing. 
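
The effect of doping on the average charge of the transition-metal ion can be checked with simple charge counting. The sketch below is a bookkeeping illustration only: it assumes purely ionic charges (La³⁺, Ba²⁺, Nd³⁺, Sr²⁺, O²⁻), reads the compounds as La₂CuO₄, La₁.₈₅Ba₀.₁₅CuO₄, NdNiO₂ and Nd₀.₈Sr₀.₂NiO₂, and uses a hypothetical helper function.

```python
from typing import Mapping, Tuple


def transition_metal_valence(other_cations: Mapping[str, Tuple[float, int]],
                             n_oxygen: float) -> float:
    """Charge the single transition-metal ion needs for a neutral formula unit.

    other_cations maps an ion label to (amount per formula unit, charge).
    Oxygen is taken as O2-, a purely ionic assumption.
    """
    positive_charge = sum(amount * charge
                          for amount, charge in other_cations.values())
    return 2.0 * n_oxygen - positive_charge


cu_parent = transition_metal_valence({"La": (2.0, 3)}, n_oxygen=4)
cu_doped = transition_metal_valence({"La": (1.85, 3), "Ba": (0.15, 2)}, n_oxygen=4)
ni_parent = transition_metal_valence({"Nd": (1.0, 3)}, n_oxygen=2)
ni_doped = transition_metal_valence({"Nd": (0.8, 3), "Sr": (0.2, 2)}, n_oxygen=2)

if __name__ == "__main__":
    print(f"Cu: {cu_parent:.2f} -> {cu_doped:.2f} "
          f"({cu_doped - cu_parent:.2f} holes per Cu)")
    print(f"Ni: {ni_parent:.2f} -> {ni_doped:.2f} "
          f"({ni_doped - ni_parent:.2f} holes per Ni)")
```

Under these assumptions, each divalent ion that replaces a trivalent one adds one hole per substituted site, so the doping level can be read directly from the formula.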

Over the past 30 years or so, researchers have 
looked for superconductivity in other com- 
pounds that have planes containing spin-1/2 
ions. Examples of such compounds are LaNiO₂
and NdNiO₂, which comprise alternating
planes of lanthanum or neodymium and
NiO₂. Ni¹⁺ ions in these materials could have
the same role in inducing superconductivity
as do Cu²⁺ ions in La₁.₈₅Ba₀.₁₅CuO₄. Several
groups have prepared LaNiO₂ and NdNiO₂
in both powder and thin-film form (see, for 
example, refs 6-8). However, no superconduc- 
tivity (but also no sign of magnetic order) has 
been found. 

Enter Li and colleagues. The authors grew 
a thin film of NdNiO₂ and then hole-doped
this film by replacing some Nd³⁺ ions with Sr²⁺
ions. They found that the resulting material,
Nd₀.₈Sr₀.₂NiO₂, superconducts at temperatures
of up to 15 K. After some 30 years of trying,
scientists have finally found a non-cuprate
compound that has a cuprate-like structure
and that exhibits superconductivity at surprisingly
high temperatures. But, unlike in the
cuprates, there is no sign of magnetic order in
NdNiO₂ down to a temperature of 1.7 K. The
authors’ discovery might therefore indicate 
that magnetism is not exclusively responsible 
for cuprate superconductivity. 

However, this conclusion is based on the 
assumption that the cuprates and hole-doped 
NdNiO₂ have similar electronic structures.
There are three reasons why this assumption
might not be valid. First, in the cuprates, the
holes reside mainly in the 2p electron orbitals
of oxygen atoms. The spins of these holes
couple antiferromagnetically to the spins of
neighbouring Cu²⁺ ions, producing a net spin
of 0. By contrast, in hole-doped NdNiO₂, the
holes reside mostly in Ni¹⁺ ions and result in
Ni²⁺ ions that, in conventional oxides, have a
spin of 1 (ref. 9). But perhaps the situation here
is different from that of conventional oxides. 
X-ray spectroscopy could determine whether 
this is the case, if good enough samples are 
available. 
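
The spin values quoted for these ions follow from counting d electrons. The sketch below is a free-ion illustration, not a calculation for the real materials: it assumes that the 4s electrons are removed first, that the remaining electrons beyond the argon core occupy the five 3d orbitals, and that Hund's rule maximizes the number of unpaired spins. It ignores crystal-field effects, which is one reason the real compound could differ. The function names are hypothetical.

```python
ATOMIC_NUMBER = {"Cu": 29, "Ni": 28}
ARGON_CORE_ELECTRONS = 18


def d_electron_count(element: str, charge: int) -> int:
    """Number of 3d electrons in a cation, assuming an empty 4s shell."""
    return ATOMIC_NUMBER[element] - ARGON_CORE_ELECTRONS - charge


def hund_spin(d_count: int) -> float:
    """Total spin for d_count electrons in five d orbitals under Hund's rule."""
    if not 0 <= d_count <= 10:
        raise ValueError("a d shell holds between 0 and 10 electrons")
    unpaired = d_count if d_count <= 5 else 10 - d_count
    return unpaired / 2


if __name__ == "__main__":
    for element, charge in (("Cu", 2), ("Ni", 1), ("Ni", 2)):
        d = d_electron_count(element, charge)
        print(f"{element}{charge}+: d{d}, spin {hund_spin(d)}")
```

In this picture Cu²⁺ and Ni¹⁺ share the same d⁹ configuration with spin 1/2, whereas removing one more electron from nickel gives a d⁸ ion with spin 1, in line with the contrast drawn in the text.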

Second, the antiferromagnetic coupling 
between spins might be substantially stronger 
in the cuprates than in NdNiO₂. This difference
would be consistent with the absence
of magnetic order in NdNiO₂. And third, a
theoretical study suggests that 5d electron
orbitals of lanthanum atoms in LaNiO₂ and of
neodymium atoms in NdNiO₂ are involved in
electrical transport. If confirmed, this result
could change the picture completely. In particular,
local spins would be affected by being
coupled to delocalized conducting electrons,
as in compounds called Kondo systems. Such
systems exhibit a minimum in a plot of resistivity
against temperature, which is observed by
Li et al. for NdNiO₂.
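
A resistivity minimum of this kind can be illustrated with a common textbook form, in which a term that grows with temperature competes with a logarithmic term that grows on cooling. The sketch below is a generic model, not a fit to the data of Li et al.: the functional form ρ(T) = ρ₀ + bT⁵ − c ln T is an assumption, and all parameter values are hypothetical and in arbitrary units.

```python
import math


def resistivity(temperature_k: float, rho0: float = 1.0,
                b: float = 1e-9, c: float = 0.05) -> float:
    """Model resistivity: residual term + b*T^5 (lattice) - c*ln(T) (Kondo-like)."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    return rho0 + b * temperature_k ** 5 - c * math.log(temperature_k)


def minimum_temperature_k(b: float = 1e-9, c: float = 0.05) -> float:
    """Temperature at which the model resistivity is smallest.

    Setting the derivative 5*b*T^4 - c/T to zero gives T = (c / (5*b))**(1/5).
    """
    return (c / (5.0 * b)) ** 0.2


if __name__ == "__main__":
    t_min = minimum_temperature_k()
    print(f"minimum near {t_min:.1f} K (hypothetical parameters)")
    for t in (0.5 * t_min, t_min, 2.0 * t_min):
        print(f"T = {t:6.1f} K, resistivity = {resistivity(t):.4f}")
```

Below the minimum the logarithmic term dominates and the resistivity rises on cooling; above it the power-law term takes over.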

There are therefore many issues to address 
before it can be concluded that the electronic
structures of the cuprates and of hole-doped
NdNiO₂ are similar. Future work should check
that the nickel ions in NdNiO₂ are Ni¹⁺ ions,
determine the local symmetry and spin of 
the hole-doped states and explore how the 
temperature at which the material becomes 
superconducting varies with hole doping. 
The chemical composition of the material 
also needs to be verified, because unwanted 
hydrides or hydroxides might have formed. 
Nevertheless, Li and colleagues’ work could 
become a game changer for our understand- 
ing of superconductivity in cuprates and 
cuprate-like systems, perhaps leading to new 
high-temperature superconductors.

What makes flatworms
go to pieces


Flatworms called planarians can break off fragments of themselves that 
regenerate to form new, complete worms. The molecular cues that regulate 
the frequency of such fission events have been revealed. SEE LETTER P.655 


THOMAS W. HOLSTEIN 


Understanding how tissues and organs
can regenerate requires an appreciation
of the mechanisms and factors
that organize cells and tissues, both in space 
and through time. Planarian flatworms are a 
widely used model for studying such pattern 
formation because pieces of these animals 
that are cut off can regrow missing body parts 
and form complete worms. Planarians also 
have a self-scission behaviour called fission 
— they stretch and contract their tail tissue, 
which leads to detachment of parts of their 
posterior body that then grow into clones. 
Whether or not fission occurs depends on 
the size of the parent worm, but the underly- 
ing molecular and cellular processes have not 
been well understood. On page 655, Arnold 
et al.¹ establish a method to reliably induce
fission in the planarian Schmidtea mediterranea,
and show that cell-signalling pathways
involving the proteins Wnt and transforming
growth factor-β (TGF-β) are key regulators of
this process. 

Wnt signalling has a decisive role in develop- 
ment and cell differentiation and is involved in 
many diseases. The Wnt proteins are highly
diverse, are found only in animals and are 
usually attached to a lipid chain and secreted 
by cells. They bind to receptor proteins of dif- 
ferent families to activate various downstream 


cell-signalling cascades that regulate the levels
of cytoplasmic factors (molecules that control
gene expression and, thus, cell function).
Although our knowledge of the influence of 
Wnt signalling on tissue-pattern formation 
has advanced greatly in the past few years, how 
such patterning might be linked to specific 
tissue functions is still unknown. 

Previous studies in planarians have
characterized a molecular framework in which
self-organized gradients of Wnt proteins regulate
patterning along the length of the animal
(that is, along the anterior–posterior axis),
and in which a gradient of TGF-β regulates
patterning from its topside to its underside
(along the dorsal–ventral axis). It has been
suggested that planarian fission is regulated
by gradients in metabolic activity, molecular
positional cues or neurohormone molecules
along the anterior–posterior body axis. One
study indicated that fission might be inhibited
by the front part of the nervous system, and
another examined the biomechanical forces
and tissue properties that enable it to occur.

Unlike regeneration, which can be induced 
experimentally by cutting planarian worms 
into pieces, fission has been difficult to induce 
reliably, limiting studies on this process. How- 
ever, Arnold et al. found that transferring 
worms to cultures in which food was limited 
and water was stagnant induced fissioning in 
worms longer than about 4 or 5 millimetres 
Figure 1 | Size-dependent fission behaviour in planarian flatworms. Planarian flatworms can
reproduce through a process called fission. In this process, a worm breaks off a portion of tissue from the
back end of its body, and this portion regenerates to form a complete worm. Arnold et al.¹ examined the
molecular and cellular underpinnings of this fission process. They found that the frequency of fission
events correlated with the size of the parent animal. Experimental disruptions of the expression of certain
proteins involved in the Wnt signalling pathway (not shown), which controls tissue patterning along the
length of planarians, did not affect the positioning of fission planes along the body, but did increase or
reduce the frequency of fission events. The authors showed that Wnt signalling regulates the fine-scale 
patterning of a population of neuronal cells at the front of the worm (boxes) that inhibit fission behaviour, 
and showed that the patterning of these neurons changes with animal size. 


(Fig. 1). By analysing image recordings, the 
researchers discovered that fission events take 
about 30 minutes and result in fragments that 
are about 1 mm long, and that the frequency 
of fission events correlates with the size of the 
parent. Arnold et al. also found that, when they 
applied pressure to a cover glass placed on top 
of a worm in normal culture, the worm would 
break apart into multiple, regularly spaced fragments
along its entire anterior–posterior axis.
This suggests that, in adult worms, there are 
pre-established fission planes that scale in num- 
ber with the animal’s size, and that a hidden, 
segmented structure underlies this size control. 

Using both the starvation and compression 
methods to induce fission, the authors tested 
which molecular cues are required to induce 
size-dependent fissioning. They carried out a 
screen in which they used different RNA mol- 
ecules to selectively inhibit the expression of 
various proteins involved in patterning, including
those in the Wnt and TGF-β cell-signalling
pathways. These targeted disruptions
affected fission frequency; for example, blocking 
the expression of APC, a protein that suppresses 
the Wnt signalling pathway, roughly doubled 
the frequency of sequential fission attempts in 
which the animals showed their characteris- 
tic stretching behaviour. However, interfering 
with these signalling pathways did not affect the 
positioning of fission planes along the body axis. 
Thus, Wnt and TGF-β signalling seem to regulate
fission behaviour independently of their
function in axial patterning. 


A previous gene-expression analysis
revealed that genes encoding proteins
involved in Wnt and TGF-β signalling are co-expressed
with genes expressed by cells in the
central nervous system (CNS). In Arnold and 
colleagues’ study, removing the front part of 
the worm that contained the cephalic ganglia 
(two clusters of neurons that together com- 
prise the planarian brain) delayed the onset of 
fission behaviour. The authors saw a similar 
effect in worms in which the expression of a 
neuronal transcription-factor protein that 
was previously shown to be required for CNS 
patterning was suppressed.

Arnold et al. found that a set of neuronal 
cells that are sensitive to mechanical stimuli act 
downstream of Wnt and TGF-β signalling to
inhibit fission behaviour. The authors demonstrated
that Wnt and TGF-β signalling together
regulate the patterning of these and other spe- 
cific populations of neurons (Fig. 1). It will be 
exciting to examine how these key regulators 
of axial patterning control the fine patterning 
of the planarian nervous system; one of the
big questions about the patterning of different 
types of cell is how these signalling pathways 
are integrated by progenitor cells to induce the 
generation of specific neuronal cell types. 

Although Arnold et al. focused their analysis 
on the induction of fission, even less is known 
about how the released tissue fragments form 
complete animals. For example, it is unclear 
whether these worms regenerate after fission 
in the same way that they regrow after being 
cut into pieces. In both cases, populations of
stem cells called neoblasts cluster to form a 
mass called a blastema at the wound site in the 
tissue fragment, which in turn can regenerate
different organs and tissues. But how the
information concerning the position of the cut 
or fission plane is transmitted to neoblasts is 
not clear. 

Asexual reproduction through fission is a
major strategy for increasing population size,
not only in planarians, but also in other worm-like
creatures (including acoels and other
acoelomorph flatworms, and annelids) in
which fission occurs at the posterior end of
the animal. Sea anemones can also propagate
asexually through fission, and budding, a
fission-related strategy for asexual reproduction,
has been well characterized in the
freshwater animal Hydra and is strongly
related to regeneration.

Detailed investigation of fission and 
budding in different model organisms will be 
important because, in these processes, pattern 
formation is induced without injury, and there- 
fore might be different from regeneration after 
injury. If the processes that enable regeneration 
in planarians after fission and after cutting are 
indeed the same, future research should deter- 
mine the mechanisms that compensate for the 
lack of an injury signal in fissioning tissue. 
Such research will be crucial for understanding
how injury and patterning signals converge
to initiate the regeneration process.
Modern microprocessor built from complementary carbon nanotube transistors


Gage Hills, Christian Lau, Andrew Wright, Samuel Fuller, Mindy D. Bishop, Tathagata Srimani, Pritpal Kanhaiya,
Rebecca Ho, Aya Amer, Yosi Stein, Denis Murphy, Arvind, Anantha Chandrakasan & Max M. Shulaker


Electronics is approaching a major paradigm shift because silicon transistor scaling no longer yields historical energy- 
efficiency benefits, spurring research towards beyond-silicon nanotechnologies. In particular, carbon nanotube field-effect
transistor (CNFET)-based digital circuits promise substantial energy efficiency benefits, but the inability to
perfectly control intrinsic nanoscale defects and variability in carbon nanotubes has precluded the realization of very- 
large-scale integrated systems. Here we overcome these challenges to demonstrate a beyond-silicon microprocessor built 
entirely from CNFETs. This 16-bit microprocessor is based on the RISC-V instruction set, runs standard 32-bit instructions 
on 16-bit data and addresses, comprises more than 14,000 complementary metal-oxide-semiconductor CNFETs and is 
designed and fabricated using industry-standard design flows and processes. We propose a manufacturing methodology 
for carbon nanotubes, a set of combined processing and design techniques for overcoming nanoscale imperfections at 
macroscopic scales across full wafer substrates. This work experimentally validates a promising path towards practical
beyond-silicon electronic systems.


With diminishing returns of silicon field-effect transistor (FET) scaling,
the need for FETs leveraging nanotechnologies has been steadily
increasing. Carbon nanotubes (CNTs, nanoscale cylinders made
of a single sheet of carbon atoms with diameters of approximately
10–20 Å) are prominent among a variety of nanotechnologies that are
being considered for next-generation energy-efficient electronic systems.
Owing to the nanoscale dimensions and simultaneously high
carrier transport of CNTs, digital systems built from FETs fabricated
with CNTs as the transistor channel (that is, CNFETs) are projected to
improve the energy efficiency of today’s silicon-based technologies by
an order of magnitude.

Over the past decade, CNT technology has matured: from single
CNFETs to individual digital logic gates to small-scale digital circuits
and systems. In 2013, this progress led to the demonstration
of a complete digital system: a miniature computer comprising 178 
CNFETs that implemented only a single instruction operating on only 
a single bit of data (see Supplementary Information for a full discussion 
of previous work). However, as with all emerging nanotechnologies, 
there remained a substantial disconnect between these small-scale 
demonstrations and modern systems comprising tens of thousands 
of FETs (for example, microprocessors) to billions of FETs (for exam- 
ple, high-performance computing servers). Perpetuating this divide is 
the inability to achieve perfect atomic-level control of nanomaterials 
at macroscopic scales (for example, yielding CNTs of consistent 10-Å
diameter uniformly across industry-standard wafer substrates of diameter
150–300 mm). The resulting intrinsic defects and variations have
made the realization of such modern systems infeasible. For CNTs, 
there are three major intrinsic challenges: material defects, manufac- 
turing defects and variability. 

(1) Material defects. Although semiconducting CNTs form energy- 
efficient FET channels, the inability to precisely control CNT diameter 
and chirality results in every CNT synthesis containing some percent- 
age of metallic CNTs. Metallic CNTs have little to no bandgap and 
therefore their conductance cannot be sufficiently modulated by the 


CNFET gate, resulting in high leakage current and potentially incorrect 
logic functionality.

(2) Manufacturing defects. During wafer fabrication, CNTs inherently 
‘bundle’ together, forming thick CNT aggregates. These aggregates
result in CNFET failure (reducing CNFET circuit yield), as well
as prohibitively high particle contamination rates for very-large-scale 
integration (VLSI) manufacturing. 

(3) Variability. Energy-efficient complementary metal-oxide-
semiconductor (CMOS) digital logic requires the ability to fabricate
CNFETs of complementary polarities (p-CNFETs and n-CNFETs)
with well-controlled characteristics (for example, tunable and uniform
threshold voltages, and p- and n-CNFETs with matching on-
and off-state current). Previous techniques for realizing CNT CMOS
have relied on either extremely reactive, non-air-stable, non-silicon
CMOS-compatible materials or have lacked tunability, robustness
and reproducibility. This has severely limited the complexity of CNT
CMOS demonstrations (a complete CNT CMOS digital system has not 
yet been fabricated). 

Although much previous work has focused on overcoming these 
challenges, none meets all of the strict requirements for realizing VLSI 
systems. In this work, we overcome the intrinsic CNT defects and varia- 
tions to enable a demonstration of a beyond-silicon modern micropro- 
cessor: RV16X-NANO, designed and fabricated entirely using CNFETs. 
RV16X-NANO is a 16-bit microprocessor based on the open-source
and commercially available RISC-V instruction set processor, running 
standard RISC-V 32-bit instructions on 16-bit data and addresses. It 
integrates >14,000 CMOS CNFETs, and operates as modern micro- 
processors do today (for example, it can run compiled programs; in 
addition, we demonstrate its functionality by executing all types and 
formats of instructions in the RISC-V instruction-set architecture). 
This is made possible by our manufacturing methodology for CNTs 
(MMC), a set of original processing and circuit design techniques
that are combined to overcome the intrinsic CNT challenges. The key 
elements of MMC are: 
Fig. 1 | RV16X-NANO. a, Image of a fabricated RV16X-NANO chip. The
die area is 6.912 mm × 6.912 mm, with input/output pads placed around


the periphery. Scanning electron microscopy images with increasing 


magnification are shown below (one image is false-coloured to match the 
colouring in the schematic in b). RV16X-NANO is fabricated entirely from 
CNFET CMOS, in a wafer-scalable, VLSI-compatible, and silicon-CMOS 


(1) RINSE (removal of incubated nanotubes through selective exfo- 
liation). We propose a method of removing CNT aggregate defects 
through a selective mechanical exfoliation process. RINSE reduces CNT 
aggregate defect density by >250× without affecting non-aggregated


CNTs or degrading CNFET performance. 


compatible fashion. b, Three-dimensional to-scale rendered schematic 
of the RV16X-NANO physical layout (all dimensions are to scale except 


for the z axis, which is magnified to clarify each individual vertical 


layer). RV16X-NANO leverages a new three-dimensional (3D) physical 
architecture in which the CNFETs are physically located in the middle of 


the stack, with metal routing both above and below. 


(2) MIXED (metal interface engineering crossed with electrostatic 
doping). Our combined CNT doping process leverages both metal con- 
tact work function engineering as well as electrostatic doping to realize 
a robust wafer-scale CNFET CMOS process. We experimentally yield 


entire dies with >10,000 CNFET CMOS digital logic gates ( 
Fig. 2 | Architecture and design of RV16X-NANO. a, Block diagram
showing the organization of RV16X-NANO, including the instruction 
fetch, instruction decode, register read, execute + memory access, and 


write-back stages. See Supplementary Information section ‘RISC-V:
Operational Details’ for definitions of terms. b, Schematics describing the
high-level register transfer level (RTL) description of each stage, including 
inputs, outputs and signal connections. Additional information on the 


RV16X-NANO is in the Supplementary Information. 
Fig. 3 | RV16X-NANO experimental results. a, Experimentally measured
waveform from RV16X-NANO, executing the famous ‘Hello, World’ 
program. The waveform shows the 32-bit instruction fetched from 
memory, the program counter stored in RV16X-NANO, as well as the 
character output from RV16X-NANO. Below the waveform, we convert 
the binary output (shown in red in hexadecimal code) to its ASCII
characters, showing RV16X-NANO printing out
“Hello, world! I am RV16XNano, made from CNTs.” In addition to this
program, we test functionality by executing all of the 31 instructions within 
RV32E (see Supplementary Information). b, RV16X-NANO is designed 
using conventional electronic design automation (EDA) tools, leveraging 
our CNT process design kit and CNT CMOS standard cell library. An 
example combinational cell (full-adder) and example sequential cell 
(D-flip-flop) are shown alongside an optical microscopy image of the 
fabricated cells, their schematics, as well as their experimentally measured 
waveforms. For the full-adder, we show the outputs (sum and carry-out 


‘not-or’ gates with functional yield 14,400/14,400, comprising 57,600
total CNFETs), and present a wafer-scale CNFET CMOS uniformity 
characterization across 150-mm wafers (such as analysing the yield 
for more than 100 million possible combinations of cascaded logic 
gate pairs). 

(3) DREAM (designing resiliency against metallic CNTs). This tech- 
nique overcomes the presence of metallic CNTs entirely through 
circuit design. DREAM relaxes the requirement on metallic CNT 
purity by about 10,000× (relaxed from a semiconducting CNT purity
requirement of 99.999999% to 99.99%), without imposing any addi- 
tional processing steps or redundancy. DREAM is implemented using 
standard electronic design automation (EDA) tools, has minimal cost, 
and enables digital VLSI systems with CNT purities that are available 
commercially today. 
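The purity relaxation quoted for DREAM can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name is hypothetical, and the expected-count estimate assumes that metallic CNTs occur independently and uses the article's figure of "more than 10 million CNTs" as a lower bound.

```python
from fractions import Fraction

def metallic_fraction(semiconducting_purity_percent: str) -> Fraction:
    """Fraction of CNTs that are metallic, given semiconducting purity in percent."""
    return 1 - Fraction(semiconducting_purity_percent) / 100

# Purity requirements quoted in the text (exact decimal arithmetic via Fraction).
strict = metallic_fraction("99.999999")   # requirement without DREAM
relaxed = metallic_fraction("99.99")      # requirement with DREAM

# Ratio of tolerated metallic fractions: the "about 10,000x" relaxation.
relaxation = relaxed / strict

# Illustrative estimate only: expected metallic CNTs among 10 million CNTs
# at 99.99% purity, assuming each CNT is metallic independently.
expected_metallic = 10_000_000 * relaxed

print(relaxation, expected_metallic)
```

Exact rational arithmetic is used here so that the ratio is not blurred by floating-point rounding of values such as 99.999999.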




outputs) for all possible biasing conditions in which sweeping the voltage of
an input (from 0 to V_DD) causes a change in the logical state of the output (that
is, for the full adder, with C_OUT = A*B + B*C_IN + A*C_IN, with A = logical ‘0’
and B = logical ‘1’, then sweeping C_IN from ‘0’ to ‘1’ causes C_OUT to change
from logical ‘0’ to logical ‘1’). (CI indicates C_IN and CO indicates C_OUT.) For
the sum output S(V_OUT), there are 12 such conditions: six where V_OUT has
the same polarity as the swept input (positive unate) and six where V_OUT has
the opposite polarity to the swept input (negative unate). For the carry-out
output C(V_OUT) there are six such conditions (all positive unate); the
measurements are overlaid over one another in b). Gain for all transitions is
>15, with output voltage swing >99%. The D-flip-flop waveform (voltage 
versus time) illustrates correct functionality of the positive edge-triggered 
D-flip-flop (output state Q shows correct functionality based on data input 
D and clock input CLK). CK and CK-bar (CK with an overbar) are the clock input and the inverse of
the clock input, respectively. 
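The counts of biasing conditions given in the caption for the full adder (12 for the sum output, six for the carry-out) follow directly from its logic function. The following minimal sketch enumerates them at the logic level only; it does not model voltages, and the function names are illustrative rather than taken from the article.

```python
from itertools import product

def full_adder(a: int, b: int, c_in: int) -> tuple[int, int]:
    """Return (sum, carry_out) for one-bit inputs."""
    s = a ^ b ^ c_in
    c_out = (a & b) | (b & c_in) | (a & c_in)  # C_OUT = A*B + B*C_IN + A*C_IN
    return s, c_out

def sensitizing_conditions(output_index: int) -> list[tuple[str, tuple[int, int], str]]:
    """Cases where sweeping one input from 0 to 1 flips the chosen output.

    output_index 0 selects the sum, 1 selects the carry-out. Each case is
    (swept input name, values of the two fixed inputs, unateness).
    """
    names = ("A", "B", "C_IN")
    cases = []
    for swept in range(3):
        for fixed in product((0, 1), repeat=2):
            low = list(fixed)
            low.insert(swept, 0)    # swept input held at logical 0
            high = list(fixed)
            high.insert(swept, 1)   # swept input held at logical 1
            out_low = full_adder(*low)[output_index]
            out_high = full_adder(*high)[output_index]
            if out_low != out_high:
                kind = "positive" if out_high > out_low else "negative"
                cases.append((names[swept], fixed, kind))
    return cases

sum_cases = sensitizing_conditions(0)
carry_cases = sensitizing_conditions(1)
print(len(sum_cases), len(carry_cases))
```

The sum is an XOR of all three inputs, so every input sweep flips it, with the direction set by the other two inputs. The carry-out is a majority function, so it flips only when the two fixed inputs disagree, and always in the same direction as the swept input.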


Importantly, the entire MMC is wafer-scale, VLSI-compatible 
and is seamlessly integrated within existing infrastructures for sili- 
con CMOS—both in terms of design and of processing. Specifically, 
RV16X-NANO is designed with standard EDA tools, and leverages 
only materials and processes that are compatible with and exist 
within commercial silicon CMOS manufacturing facilities. Together, 
these contributions establish a robust CNT CMOS technology and 
represent a major milestone in the development of beyond-silicon 
electronics. 


RV16X-NANO 

Figure 1 shows an optical microscopy image of a fabricated RV16X- 
NANO die alongside three-dimensional to-scale rendered schematics 
of the physical layout. It is the largest CMOS electronic system 
Fig. 4 | MMC. a, Design and manufacturing flow for RV16X-NANO,
illustrating how MMC seamlessly integrates within conventional silicon- 
based EDA tools. Black boxes show conventional steps in silicon-CMOS 
design flows. Blue text indicates steps that are adjusted for CNTs instead 
of silicon, and red text represents the additions needed to implement the 
MMC. RV16X-NANO is the first hardware demonstration of a beyond- 
silicon emerging nanotechnology leveraging a complete RTL-to-GDS 
physical design flow that uses only conventional EDA tools. Software 
packages are from Synopsys (https://www.synopsys.com/), Cadence 
(https://www.cadence.com/) and Mentor Graphics (https://www.mentor.com/).
b, RINSE. As shown in the scanning electron microscopy images,
CNTs inherently bundle together, forming thick CNT aggregates. These 
aggregates result in CNFET failure (reduced CNFET yield) as well as 
prohibitive particle contamination for VLSI manufacturing. c, The RINSE 


realized using beyond-silicon nanotechnologies: comprising 3,762 
CMOS digital logic stages, totalling 14,702 CNFETs containing more 
than 10 million CNTs, and includes logic paths comprising up to 
86 stages of cascaded logic between flip-flops (that is, that must evaluate 
sequentially in a single clock cycle). It operates with supply voltage 
(V_DD) of 1.8 V, receives an external referenced clock (generating local
clock signals internally), receives inputs (instructions and data) from 
and writes directly to an off-chip main memory (dynamic random- 
access memory, DRAM), and stores data on-chip in a register file. No 
other external biasing or control signals are supplied. Furthermore, 
RV16X-NANO has a three-dimensional (3D) physical architecture, 
as the metal interconnect layers are fabricated both above and below 
the layer of CNFETs; this is in contrast to silicon-based systems 
in which all metal routing can only be fabricated above the bottom 
layer of silicon FETs (see Methods). In RV16X-NANO, the metal 
layers below the CNFETs are primarily used for signal routing, while 
the metal layers above the CNFETs are primarily used for power 
distribution (Fig. 1c, d). The fabrication process implements five 
metal layers and includes more than 100 individual processing steps 
(see Methods and section ‘MMC’ for details). This 3D layout, with 
process steps: (1) CNT incubation, (2) adhesion coating, (3) mechanical
exfoliation (see text for details). d, e, RINSE results. After performing 
RINSE, CNT aggregates are removed from the wafer (as shown in d). 
Importantly, the individual CNTs not in aggregates are not removed 
from the wafer, while without RINSE, sonication inadvertently removes 
large areas of all CNTs from the wafer (in e, where the top shows CNT 
incubation pre-RINSE, the middle shows CNTs left on the wafer post- 
RINSE, and the bottom shows CNTs inadvertently removed from the wafer 
after sonicating a wafer to remove CNT aggregates without performing 
the critical adhesion-coating step in RINSE). f, Particle contamination 
reduction due to RINSE: RINSE decreases particle density by >250×.

g, Ideally, individual CNTs are not inadvertently removed during RINSE; 
increasing the time of step 3 (sonication time) to over 7 h results in no 
change in CNT density across the wafer. 


routing above and below the FETs promises improved routing congestion 
(a major challenge for today’s systems), and is uniquely enabled by
CNTs (owing to their low-temperature fabrication; see Methods). 


Physical design 

The design flow of RV16X-NANO leverages only industry-standard 
tools and techniques: we create a standard process design kit (PDK) for 
CNFETs as well as a library of standard cells for CNFETs that is compat- 
ible with existing EDA tools and infrastructure without modification. 
Our CNFET process design kit includes a compact model for circuit 
simulations that is experimentally calibrated to our fabricated CNFETs. 
The standard cell library comprises 63 unique cells, and includes both 
combinational and sequential circuit elements implemented with both 
static CMOS and complementary transmission-gate digital logic circuit 
topologies (see Supplementary Information for a full list of standard 
library cells, including circuit schematics and physical layouts). We use 
the CNFET process design kit to characterize the timing and power for 
all of the library cells, which we experimentally validate by fabricating 
and measuring all cells individually (see Supplementary Information 
for full description and experimental characterization of the standard 
Fig. 5 | MIXED. a, Schematic of CNFET CMOS fabricated using MIXED.
MIXED is a combined doping process that leverages both metal contact 
work-function engineering as well as electrostatic doping to realize a 
robust wafer-scale CNFET CMOS process. We use platinum contacts 

and SiO, passivation for p-CNFETs, and titanium contacts and HfO, 
passivation for n-CNFETs (see Methods for details). To characterize 
MIXED, we fabricated dies with 10,400 CNFET CMOS digital logic gates 
across 150-mm wafers (b). c, d, Experimental results. c, I_D versus V_DS
characteristics showing p-CNFETs and n-CNFETs that exhibit similar
I_D–V_DS characteristics (for opposite polarity of input bias conditions,
for example, V_DS,p = −V_DS,n), achieved with MIXED. The gate-to-source
voltage V_GS is swept from −V_DD to V_DD in increments of 0.1 V.
See Supplementary Information for I_D–V_GS and additional CNFET
characteristics. d, Output voltage transfer curves (VTCs, V_OUT versus V_IN)
for all 10,400 CNT CMOS logic gates (nor2) within a single die. Each
VTC illustrates V_OUT as a function of the input voltage of one input
(V_IN), while the other input is held constant. For each nor2 logic gate
(with logical function OUT = !(IN_A|IN_B)), we measure the VTC for each
of two cases: V_OUT versus V_IN,A with V_IN,B = 0 V, and V_OUT versus V_IN,B
with V_IN,A = 0 V. All 10,400/10,400 exhibit correct functionality (which
we define as having output voltage swing >70%). The black dotted line
represents the average VTC (average V_IN across all measured VTCs for each
value of V_OUT), while the red dotted line represents the boundary of ±3
standard deviations (again, across all V_IN values for each value of V_OUT).
See Supplementary Information for extracted distributions of key metrics 
from these experimental measurements (gain, output voltage swing and 
SNM analysing >100 million possible cascaded logic gate pairs formed
from these 10,400 samples), as well as uniformity characterization across 
the 150-mm wafer. Importantly, despite the high yield and robust CNFET 
CMOS enabled by MIXED and RINSE, we note that there are outlier 
gates with degraded output swing (the blue lines in d). These outliers are 
caused by CNT CMOS logic gates that contain metallic CNTs; the third 
component of the MMC (DREAM; see Fig. 6), is a design technique that is 
essential for overcoming the presence of these metallic CNTs. 
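The figures quoted for the nor2 characterization can be reproduced with simple counting. The sketch below makes several assumptions that are not in the article: that a static CMOS nor2 gate uses four transistors, that cascaded pairs are counted as ordered (driver, load) combinations including a gate paired with itself, and that swing is measured relative to the 1.8 V supply. The function names and the example voltages are hypothetical.

```python
def nor2(in_a: int, in_b: int) -> int:
    """Logical function of the two-input not-or gate: OUT = !(IN_A | IN_B)."""
    return int(not (in_a or in_b))

def is_functional(v_out_high: float, v_out_low: float,
                  v_dd: float = 1.8, min_swing: float = 0.70) -> bool:
    """Caption's criterion: output voltage swing greater than 70% of the supply."""
    return (v_out_high - v_out_low) / v_dd > min_swing

# Assumption: a static CMOS nor2 uses two p-type and two n-type transistors.
CNFETS_PER_NOR2 = 4
total_cnfets = 14_400 * CNFETS_PER_NOR2

# Assumption: every measured gate can drive every measured gate.
GATES_PER_DIE = 10_400
ordered_pairs = GATES_PER_DIE ** 2

truth_table = [nor2(a, b) for a in (0, 1) for b in (0, 1)]
print(truth_table, total_cnfets, ordered_pairs)
print(is_functional(1.7, 0.1), is_functional(1.0, 0.4))  # made-up voltages
```

Under these assumptions the transistor count matches the 57,600 CNFETs quoted for 14,400 gates, and 10,400 gates give slightly more than 100 million cascaded pairs.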


cell library). A full description of our industry-practice VLSI design 
methodology, including how we implement DREAM during logic syn- 
thesis and place-and-route, is provided in the Methods. 


Computer architecture 

Figure 2 illustrates the architecture of RV16X-NANO, which fol- 
lows conventional microprocessor design (implementing instruction 
fetch, instruction decode, register read, execute/memory access, 
and write-back stages). It is designed from RISC-V, a standard open 
instruction-set architecture used in commercial products today and 
gaining widespread popularity in both academia and industry (see
https://riscv.org/wp-content/uploads/2017/05/Tue1345pm-NVIDIA-Sijstermans.pdf
and https://www.westerndigital.com/company/innovations/risc-v).
RV16X-NANO is derived from a full
32-bit RISC-V microprocessor supporting the RV32E instruction set 
(31 different 32-bit instructions, see Supplementary Information),
while truncating the data path width from 32 bits to 16 bits, and reduc- 
ing the number of registers from 16 to 4. It is designed using the pub- 
licly available software Bluespec (https://bluespec.com/), and is verified 
using Satisfiability Modulo Theories (SMT)-based bounded model
checking against a formal specification of the RISC-V instruction-set 
architecture (see Supplementary Information). To demonstrate the cor- 
rect functionality of the microprocessor, we experimentally run and 
validate correct functionality of all types and formats of instructions 
on the fabricated RV16X-NANO. Figure 3 shows the first program 
executed on RV16X-NANO: the famous ‘Hello, World’. See Methods
and Supplementary Information for schematics, operational details and 
experimental measurements. 
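To make concrete what running standard 32-bit instructions on 16-bit data with four registers means, here is a toy sketch. It uses the standard RISC-V field layout for the ADDI and ADD encodings, but everything else is an assumption for illustration: the names `decode` and `step`, the choice to reject register indices above 3, and the treatment of register 0 as hard-wired zero are not taken from the article and do not describe the actual RV16X-NANO implementation.

```python
DATA_MASK = 0xFFFF   # data path truncated from 32 bits to 16 bits
NUM_REGS = 4         # register file reduced from 16 registers to 4

def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def decode(instr: int) -> dict[str, int]:
    """Split a 32-bit RISC-V instruction word into its standard fields."""
    return {
        "opcode": instr & 0x7F,
        "rd": (instr >> 7) & 0x1F,
        "funct3": (instr >> 12) & 0x7,
        "rs1": (instr >> 15) & 0x1F,
        "rs2": (instr >> 20) & 0x1F,
        "funct7": (instr >> 25) & 0x7F,
        "imm_i": sign_extend(instr >> 20, 12),  # I-type immediate
    }

def step(regs: list[int], instr: int) -> None:
    """Execute one ADDI or ADD instruction on a 16-bit, 4-register machine."""
    f = decode(instr)
    if f["rd"] >= NUM_REGS or f["rs1"] >= NUM_REGS:
        raise ValueError("register index outside the 4-entry register file")
    if f["opcode"] == 0x13 and f["funct3"] == 0:        # ADDI rd, rs1, imm
        result = regs[f["rs1"]] + f["imm_i"]
    elif f["opcode"] == 0x33 and f["funct3"] == 0 and f["funct7"] == 0:  # ADD
        if f["rs2"] >= NUM_REGS:
            raise ValueError("register index outside the 4-entry register file")
        result = regs[f["rs1"]] + regs[f["rs2"]]
    else:
        raise NotImplementedError("only ADDI and ADD are sketched here")
    if f["rd"] != 0:                                    # assumed: x0 stays zero
        regs[f["rd"]] = result & DATA_MASK              # wrap to 16 bits

regs = [0, 0, 0, 0]
step(regs, 0x00500093)  # addi x1, x0, 5
step(regs, 0xFFF00113)  # addi x2, x0, -1
step(regs, 0x002081B3)  # add  x3, x1, x2
print([hex(r) for r in regs])
```

The instruction words keep their full 32-bit encoding, while every result is masked to 16 bits, so the value -1 is stored as 0xFFFF and the final addition wraps around to 4.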


MMC 

Here we describe our MMC, a set of combined processing and
design techniques that are the foundation for enabling the realization 
of RV16X-NANO (Fig. 4a). All design and fabrication processes are 
wafer-scale and VLSI-compatible, not requiring any per-unit custom- 
ization or redundancy. 


RINSE 

The CNFET fabrication process begins by depositing CNTs uniformly 
over the wafer. 150-mm-diameter wafers (with the bottom metal sig- 
nal routing layers and gate stack of the CNFET already fabricated for 
the 3D design) are submerged in solutions containing dispersed CNTs 
(Methods). Although CNTs are uniformly deposited over the wafer, 
the CNT deposition also inherently results in manufacturing defects: 
CNT aggregates deposited randomly across the wafer (Fig. 4b). These 
CNT aggregates act as particle contamination, reducing die yield. 
Several existing techniques have attempted to remove these aggregates 
before CNT deposition, but none is sufficient to meet wafer-level yield 
requirements for VLSI systems: (1) excessive high-power sonication 
for dispersing aggregates in solution damages CNTs, which results in 
degraded CNFET performance and does not disperse all CNTs; (2) centrifugation,
which does not remove all smaller aggregates (and aggregates
can re-form post-centrifugation); (3) excessive filtering, which
removes both aggregates and the CNTs themselves from the solution;
and (4) etching the aggregates, which is not feasible owing to lack of
selectivity versus the underlying CNTs themselves. Instead, to remove 
these aggregates, we developed a process that we call RINSE, consisting 
of three steps (Fig. 4c): 

(1) CNT incubation. Solution-based CNTs are deposited on wafers 
pre-treated with a CNT adhesion promoter (hexamethyldisilazane, 
bis(trimethylsilyl)amine). 

(2) Adhesion coating. A standard photoresist (polymethylglutarimide) 
is spin-coated onto the wafer and cured at about 200°C. 

(3) Mechanical exfoliation. The wafer is placed in solvent (N- 
methylpyrrolidone) and sonicated. 

The key to RINSE is the adhesion coating (step 2): without it, soni- 
cating the wafer inadvertently removes sections of CNTs in addition to 
the aggregates (Fig. 4d). The adhesion coating leaves an atomic layer of 
carbon that remains after step 3, which exerts sufficient force to adhere 
the CNTs to the wafer surface while still allowing for the removal of the 
aggregates. Experimental results for RINSE are shown in Fig. 4d–g; by 
optimizing the adhesion-coating cure temperature and time as well as 
the sonication power and time, RINSE reduces the CNT aggregate den- 
sity by >250× (quantified by the number of CNT aggregates per unit 
area) without damaging the CNTs or affecting CNFET performance 
(see Supplementary Information). 


MIXED 

After using RINSE to overcome intrinsic CNT manufacturing defects, 
CNFET circuit fabrication continues. Unfortunately, while energy- 
efficient CMOS logic requires both p-CNFETs and n-CNFETs with 
controlled and tunable properties (such as threshold voltage), tech- 
niques for realizing CNT CMOS today result in large FET-to-FET 

Fig. 6 | DREAM. DREAM overcomes the presence of metallic CNTs 
entirely through circuit design, and is the final component of the MMC. 
DREAM relaxes the requirement on metallic CNT purity by about 
10,000×, without imposing any additional processing steps or redundancy. 
DREAM is implemented using standard EDA tools, has minimal cost 
(<10% energy, <10% delay and <20% area), and enables digital 
VLSI systems with CNT purities that are available commercially today 
(99.99% semiconducting CNT purity). a, VTCs for driving logic stages 
and mirrored VTCs for loading logic stages, showing SNM simulated 
for 4 different logic stage pairs (SNM is defined in the Supplementary 
Information), with up to two metallic CNTs in all CNFETs. The logic stage 
pairs (nand2, nand2) and (nor2, nor2) have better SNM than do (nand2, 
nor2) and (nor2, nand2) despite all logic stages having exactly the same 
VTCs. We note that we distinguish logic stages (for example, an inverter) 
from logic gates (for example, a buffer, by cascading two inverters); a 
logic gate can comprise multiple logic stages. b, Example DREAM SNM 
table (see Methods for details, analysed for a projected 7-nm node with 
a scaled VDD of 500 mV), which shows the minimum SNM for each 
pair of connected logic stages. As an example, values less than 83 mV 
are highlighted in red, indicating that these combinations would not be 

variability that has made the realization of large-scale CNFET CMOS 
systems infeasible. Moreover, the vast majority of existing techniques 
are not air-stable (for example, they use materials that are extremely 
reactive in air), are not uniform or robust (for example, they do not 
always successfully realize CMOS), or rely on materials not compati- 
ble with conventional silicon CMOS processing (for example, molecu- 
lar dopants that contain ionic salts prohibited in commercial fabrication 
facilities). 

These challenges are overcome by our processing technique, MIXED, 
described in Fig. 5. The key to MIXED is a combined doping approach 
that engineers both the oxide deposited over the CNTs to encapsulate 
the CNFET as well as the metal contact to the CNTs. First, we encap- 
sulate the CNFETs in oxide (deposited by atomic-layer deposition) 
to isolate them from their surroundings. By leveraging the atomic- 
layer control of atomic-layer deposition, we also engineer the precise 
stoichiometry of this oxide encapsulating the CNTs, which enables us 
to simultaneously electrostatically dope the CNTs (the stoichiometry 
permitted during design, to reduce overall susceptibility to noise at the 
VLSI circuit level. c, Yield (pNMS) versus semiconducting CNT purity for a 
required SNM level (SNMR) of SNMR = VDD/5, shown for the OpenSparc 
‘dec’ module designed using the 7-nm node CNFET standard library cells 
derived from the ASAP7 process design kit with a scaled VDD of 500 mV 
(details in Methods). d, Fabricated CNT CMOS die, comprising 1,000 
NMOS CNFETs and 1,000 PMOS CNFETs. Semiconducting CNT purity 
is ps ≈ 99.99%, with around 15–25 CNTs per CNFET. e, f, Experimental 
demonstration of DREAM. VTCs for nand2 and nor2 generated by 
randomly selecting two NMOS and two PMOS CNFETs from d (some of 
which contain metallic CNTs). This is repeated to form 1,000 unique nor2 
and nand2 VTCs. We then analyse the SNM for over one million logic 
stage pairs (shown in f), corresponding to all combinations of 1,000 VTCs 
for the driving logic stage and 1,000 VTCs for the loading logic stage. 
e, A subset of these logic stage pairs; the (nor2, nor2) maintains minimum 
SNM > 0, while (nand2, nor2) suffers from minimum SNM < 0 in the 
presence of metallic CNTs; >99.99% of (nor2, nor2) and (nand2, nand2) 
logic stage pairs achieve SNM > 0 V, while only about 97% of (nand2, 
nor2) achieve SNM > 0 V. f, Cumulative distributions of SNM over one 
million logic stage pairs. 


dictates both the amount of redox reaction at the oxide-CNT inter- 
face and the fixed charge in the oxide). In addition, we engineer the 
metal source/drain contacts to the CNTs to further optimize the 
p- and n-CNFETs. We use a lower-work-function metal (titanium) for 
the contacts to n-CNFETs and a higher-work-function metal for the 
contacts to p-CNFETs (platinum), improving the on-state drive current 
of both (for a given off-state leakage current). In contrast to previous 
approaches, MIXED has the following key advantages: it leverages only 
silicon CMOS-compatible materials, it allows for precise threshold volt- 
age tuning through controlling the stoichiometry of the atomic-layer 
deposition doping oxide, and it is robust owing to tight process control 
by using atomic-layer deposition and only air-stable materials. 
Figure 5c shows the current-voltage (I-V) characteristics of 
p-CNFETs and n-CNFETs, demonstrating well-matched characteris- 
tics (such as on- and off-state currents). To demonstrate the reproduc- 
ibility of MIXED at the wafer scale, Fig. 5 shows measurements from 
10,400/10,400 correctly functioning 2-input ‘not-or’ (nor2) CNFET 
logic gates within a single die, and 1,000/1,000 correctly functioning 
nor2 gates randomly selected from across a 150-mm wafer. Additional 
characterization results (including output voltage swing, gain, and SNM 
for >100 million possible combinations of cascaded logic gate pairs) are 
in the Supplementary Information. This demonstrates solid-state, air-stable, 
VLSI- and silicon-CMOS-compatible CNFET CMOS at the wafer scale. 


DREAM 

Despite the robust CNFET CMOS enabled by RINSE and MIXED, a small 
percentage (around 0.01%) of CNTs are metallic CNTs. Unfortunately, a 
metallic CNT fraction of 0.01% can be prohibitively large for VLSI-scale 
systems, owing to two major challenges—increased leakage power, which 
degrades energy-delay product (EDP) benefits, and degraded noise 
immunity, which potentially results in incorrect logic functionality. To 
quantify the noise immunity of digital logic, we extract the static noise 
margin (SNM) for each pair of connected logic stages, using the voltage 
transfer curves (VTCs) of each stage (details in Extended Data Fig. 8). 
The probability that all connected logic stages meet a minimum SNM 
requirement (SNMR, typically chosen by the designer as a fraction of 
VDD; for example, SNMR = VDD/4) is pNMS: the probability that all noise 
margin constraints are satisfied (Methods). Although previous works 
have set requirements on semiconducting-CNT purity (ps) based on lim- 
iting metallic-CNT-induced leakage power, no existing works have pro- 
vided VLSI circuit-level guidelines for ps based on both increased leakage 
and the resulting degraded SNM. Although ps of 99.999% is sufficient to 
limit EDP degradation to <5%, SNM imposes far stricter requirements 
on purity: ps must be about 99.999999% to achieve pNMS > 99% (analysed 
for 1 million gate circuits, Supplementary Information). 
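
As an illustration of how an SNM value can be extracted from the VTCs of a driving and a loading logic stage, the following Python sketch applies the widely used largest-square (butterfly-curve) construction in 45°-rotated coordinates. It is a minimal sketch under stated assumptions, not the definition used in this work (which is given in the Supplementary Information): the names `interp` and `snm_pair`, the tanh-shaped example VTC and the sampling density are hypothetical, and both VTCs are assumed to be monotonically non-increasing.

```python
import math
from bisect import bisect_right


def interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)  # xs[i - 1] <= x < xs[i]
    t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + t * (ys[i] - ys[i - 1])


def snm_pair(drive_vtc, load_vtc, vdd, n=2001):
    """Largest-square SNM (in volts) for a (driving, loading) stage pair.

    drive_vtc and load_vtc are hypothetical callables mapping an input
    voltage to an output voltage; both must be non-increasing on [0, vdd].
    """
    grid = [vdd * i / (n - 1) for i in range(n)]
    r2 = math.sqrt(2.0)
    # Driving stage: points (x, f(x)), expressed in 45-degree-rotated axes.
    u1 = [(x - drive_vtc(x)) / r2 for x in grid]
    v1 = [(x + drive_vtc(x)) / r2 for x in grid]
    # Loading stage, mirrored: points (g(y), y); reversed so that u increases.
    rev = grid[::-1]
    u2 = [(load_vtc(y) - y) / r2 for y in rev]
    v2 = [(load_vtc(y) + y) / r2 for y in rev]
    lo, hi = max(u1[0], u2[0]), min(u1[-1], u2[-1])
    gaps = []
    for i in range(n):
        u = lo + (hi - lo) * i / (n - 1)
        gaps.append(interp(u, u1, v1) - interp(u, u2, v2))
    # The largest gap in each lobe is the diagonal of its largest inscribed
    # square; the SNM is the side of the smaller of the two squares.
    return min(max(gaps), -min(gaps)) / r2


if __name__ == "__main__":
    vdd = 1.8
    # Hypothetical inverter-like VTC with a peak gain of 10.
    inverter = lambda x: 0.5 * vdd * (1.0 - math.tanh(20.0 * (x - 0.5 * vdd) / vdd))
    print(f"SNM = {snm_pair(inverter, inverter, vdd):.3f} V")
```

In this construction a pair whose butterfly plot has lost one of its lobes yields a non-positive value, which is one way to read the SNM < 0 cases discussed for pairs containing metallic CNTs.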

Unfortunately, typical CNT synthesis today achieves a ps value of only 
about 66%. While many different techniques have been proposed to 
overcome the presence of metallic CNTs (Supplementary Information), 
the highest reported purity is a ps of about 99.99%: this is 10,000× 
below the requirement for VLSI circuits. Moreover, these tech- 
niques have substantial cost, requiring either additional processing 
steps (for example, applying high voltages for electrical ‘breakdown’ 
of metallic CNTs during fabrication) or redundancy (incurring sub- 
stantial energy-efficiency penalties). Here we present and experimen- 
tally validate a new technique, DREAM, that overcomes the presence of 
metallic CNTs entirely through circuit design. The key contribution of 
DREAM is that it reduces the required ps by around 10,000×, allowing 
99% pNMS with ps = 99.99% (for circuits with one million logic gates). 
This enables digital VLSI circuits to use CNT processing available today: 
ps = 99.99% is already commercially available (and can also be achieved 
through several means, including solution-based sorting, which we use 
in our process for fabricating RV16X-NANO; see Methods). 
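
The 10,000× figure follows directly from comparing metallic-CNT fractions (one minus the semiconducting purity). The short Python sketch below reproduces that arithmetic exactly using rational numbers; the helper name `metallic_fraction` is a hypothetical choice made for this illustration, and the purity values are the ones quoted in the text.

```python
from fractions import Fraction


def metallic_fraction(purity_percent: str) -> Fraction:
    """Fraction of CNTs that are metallic, given semiconducting purity in %."""
    return 1 - Fraction(purity_percent) / 100


required = metallic_fraction("99.999999")  # purity needed without DREAM
available = metallic_fraction("99.99")     # purity available commercially
as_grown = metallic_fraction("66")         # typical as-synthesized purity

relaxation = available / required
print(relaxation)  # prints 10000

# Reduction in metallic-CNT fraction from as-grown to sorted material.
print(as_grown / available)  # prints 3400
```

Working with exact fractions avoids the rounding noise that arises when subtracting purities such as 0.99999999 from 1 in floating point.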

The key insight for DREAM is that metallic CNTs affect different 
pairs of logic stages uniquely depending on how the logic stages are 
implemented (considering both the schematic and physical layout). 
As a result, the SNM of specific combinations of logic stages is more 
susceptible to metallic CNTs. To improve overall pNMS for a digital VLSI 
circuit, DREAM applies a logic transformation during logic synthesis to 
achieve the same circuit functionality, while prohibiting the use of spe- 
cific logic stage pairs whose SNM is most susceptible to metallic CNTs. 
As an example, let (GD, GL) be a logic stage pair with driving logic stage 
GD and loading logic stage GL. Figure 6 shows that some logic stage 
pairs have better SNM in the presence of metallic CNTs than others, 
despite using exactly the same VTCs for the logic stages comprising 
the circuit (in this instance, logic stage pairs (nand2, nand2) and (nor2, 
nor2) have better SNM than (nand2, nor2) or (nor2, nand2)). Thus, a 
designer can improve pNMS by prohibiting the use of logic stage pairs 
that are more susceptible to metallic CNTs, while permitting logic stage 
pairs that maintain better SNM despite the presence of metallic CNTs. 
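
To make the idea of prohibiting logic stage pairs concrete, the following Python sketch scans a list of driver-to-load connections and reports those that form a prohibited (driving, loading) pair. It is only an illustration: the tuple-based netlist format, the instance names and the function `find_prohibited` are hypothetical, and the prohibited set simply reuses the (nand2, nor2) and (nor2, nand2) example from Fig. 6.

```python
# Illustrative prohibited set, taken from the example pairs named in the text.
PROHIBITED = {("nand2", "nor2"), ("nor2", "nand2")}


def find_prohibited(connections, prohibited=PROHIBITED):
    """Return the connections whose (driver type, load type) is prohibited.

    Each connection is a hypothetical 4-tuple:
    (driver instance, driver stage type, load instance, load stage type).
    """
    return [c for c in connections if (c[1], c[3]) in prohibited]


# Hypothetical three-connection netlist fragment.
netlist = [
    ("u1", "nand2", "u2", "nand2"),
    ("u2", "nand2", "u3", "nor2"),
    ("u3", "nor2", "u4", "nor2"),
]

violations = find_prohibited(netlist)
for drv, drv_type, load, load_type in violations:
    print(f"{drv} ({drv_type}) -> {load} ({load_type}) is prohibited")
```

A check of this kind only detects violations; removing them requires re-synthesizing the affected logic with permitted pairs, as described in the Methods.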

Beyond this simple example to illustrate DREAM, we also quan- 
tify the benefit of DREAM using both simulation and experimental 
analysis for VLSI-scale circuits; in simulation, we leverage a compact 
model for CNFETs (derived from ref. *), which accounts for both 
semiconducting CNTs and metallic CNTs, to analyse the effect of 
metallic CNTs on the leakage power, energy consumption, speed and 
noise susceptibility of physical designs of VLSI-scale circuits at a 7-nm 
technology node designed using standard EDA tools, with and with- 
out DREAM (results are shown in Fig. 6; see additional discussion 
in Supplementary Information). Experimentally, we fabricate and char- 
acterize 2,000 CMOS CNFETs fabricated with MIXED (1,000 p-type 
metal-oxide-semiconductor (PMOS) and 1,000 n-type metal-oxide- 
semiconductor (NMOS) CNFETs; see Fig. 6). Using I-V measurements 
from these 2,000 CNFETs, we analyse one million combinations of 
CNFET digital logic gates (whose electrical characteristics are solved 
using the I-V characteristics of the measured CNFETs; Extended Data 
Fig. 8) to show the benefits of DREAM in reducing circuit susceptibility 
to noise. In the Methods, we provide extensive details of these analyses 
and the implementation of DREAM for arbitrary digital VLSI circuits, 
including how to implement DREAM using standard industry-practice 
physical design flows, how we implement DREAM for RV16X-NANO, 
and an efficient algorithm to satisfy target pNMS constraints (such as 
pNMS = 99%), while minimizing energy, delay and area costs. 


Outlook 

These combined processing and design techniques overcome the major 
intrinsic CNT challenges. Our complete manufacturing methodology 
for CNTs (MMC) enables a demonstration of a beyond-silicon modern 
microprocessor fabricated from CNTs, RV16X-NANO. In addition 
to demonstrating the RV16X-NANO microprocessor, we thoroughly 
characterize and analyse all facets of MMC, illustrating the feasibility 
of our approach and more broadly of a future CNT technology. This 
work is a major advance for CNTs, paving the way for next-generation 
beyond-silicon electronic systems. 
METHODS 


Fabrication process. The fabrication process is shown in Extended Data Fig. 1, 
and a final fabricated 150-mm wafer is shown in Extended Data Fig. 4. It uses five 
metal layers and over 100 individual processing steps. 

Bottom metal routing layers. The starting substrate is a 150-mm silicon wafer with 
800-nm-thick thermal oxide for isolation. The bottom metal wire layers are defined 
using conventional processing (for example, lithographic patterning, metal depo- 
sition, etching, and so on). After the first metal layer is patterned (Extended Data 
Fig. 1a), an oxide spacer (300°C) is deposited to separate this first metal layer from 
the subsequent second metal layer (Extended Data Fig. 1b). To produce interlayer 
vias between the first and second metal layer, vias are lithographically patterned and 
etched through this spacer dielectric using dry reactive ion etching (RIE) that stops 
on the bottom metal layer (Extended Data Fig. 1c). The second metal layer is then 
defined lithographically and deposited. The vias are formed simultaneously with 
the second metal wire layer, because the vias are filled during the metal deposition 
(Extended Data Fig. 1d). RV16X-NANO has two bottom metal layers, which are used 
for signal routing. The second metal layer also acts as the bottom gate for the CNFETs. 
Bottom gate CNFETs. The second metal layer (Extended Data Fig. 1d) provides 
both signal routing (local interconnect) as well as the bottom gate for the CNFETs. 
To fabricate the remaining bottom gate CNFET structure, a high-k (k is the dielec- 
tric constant) gate dielectric (a dual-stack of AlO2 and HfO2) is deposited through 
atomic layer deposition (at 300 °C) over the bottom metal gates (Extended Data 
Fig. 1e). The HfO2 is used for the majority of the dielectric stack owing to its high 
dielectric constant, while the AlO2 is used for its improved seeding and increased 
dielectric breakdown voltage. Following gate dielectric deposition, contact vias 
through the gate dielectric are patterned, and again RIE is used to etch the contact 
vias, stopping on the local bottom gates (Extended Data Fig. 1f). These contact 
vias are used by the top metal wiring to contact and route to the bottom gates and 
bottom metal routing layers. Post-etch, the surface is cleaned with both a solvent 
rinse as well as oxygen plasma, in preparation for the CNT deposition. Before CNT 
deposition, the surface is treated with hexamethyldisilazane, a common photoresist 
adhesion promoter, which improves the CNT deposition (both density and uni- 
formity) over the high-k gate dielectric. The 150-mm wafer is then submerged in 
a toluene-based solution of purified CNTs (similar to the commercial Isosol-100 
available from NanoIntegris; http://nanointegris.com/), containing approximately 
99.99% semiconducting CNTs. The amount of time the wafer incubates in the 
solution, as well as the concentration of the CNT solution, both affect the final 
CNT density; this process is optimized to achieve approximately 40–60 CNTs per 
linear micrometre (Extended Data Fig. 1g). Immediately before CNT incubation, 
the CNT solution is diluted to the target concentration and is horn-sonicated 
briefly to maximize CNT suspension (importantly, some CNT aggregates will 
always remain). Post-CNT deposition, we perform the RINSE method (the first 
step of our MMC) to remove CNT aggregates that deposit on the wafer, leaving 
CNTs uniformly deposited across the 150-mm wafer. Importantly, RINSE does not 
degrade the remaining CNTs or remove the non-aggregated CNTs on the wafer 
(Extended Data Fig. 5). After CNT incubation, we perform the CNT active etch in 
order to remove CNTs outside the active region of the CNFETs (that is, the channel 
region of the CNFETs). To do so, we lithographically pattern the active region 
of the CNFETs (protecting CNTs in these regions with photoresist), and etch all 
CNTs outside these regions in oxygen plasma. The photoresist is then stripped in 
a solvent rinse, leaving CNTs patterned only in the intended locations (that is, in 
the channel regions of the CNFETs) on the wafer (Extended Data Fig. 1h). We use 
solution-based CNTs here, but an alternative method for depositing CNTs on the 
substrate is aligned growth of CNTs on a crystalline substrate followed by transfer 
of the CNTs onto the wafer used for circuit fabrication; both methods have shown 
the ability to achieve high-drive-current CNFETs. 

MIXED method for CNT CMOS. After the active etch of the CNTs (described in the 
paragraph above), the p-CNFET source and drain metal contacts are lithograph- 
ically patterned and defined. We deposit the p-CNFET contacts (0.6-nm-thick 
titanium for adhesion followed by 85-nm-thick platinum) using electron-beam 
evaporation, and the contacts are patterned through a dual-layer lift-off process 
(Extended Data Fig. 1i). This third metal layer acts as both the p-CNFET source 
contact and the p-CNFET drain contact, as well as the local interconnect. After 
establishing the p-CNFET source and drain contacts, we passivate the p-CNFETs 
by depositing 100-nm-thick SiO2 over only the p-CNFETs (Extended Data Fig. 1j). 
Following p-CNFET passivation, the wafer undergoes an oxide densification anneal 
in forming gas (dilute H2 in N2) at 250°C for 5 min. This concludes the p-CNFET 
fabrication. To fabricate the n-CNFETs, the fourth metal layer (100-nm-thick tita- 
nium, n-CNFET source and drain contacts) is defined (Extended Data Fig. 1k, 
similar to the p-CNFET source and drain contact definition). For the electrostatic 
doping, nonstoichiometric HfOx is deposited through atomic-layer deposition 
at 200°C uniformly over the wafer. Finally, we lithographically pattern and etch con- 
tact vias (Extended Data Fig. 1m) through the HfOx for metal contacts to the bot- 
tom metal layers, and then etch the HfOx covering the p-CNFETs (the p-CNFETs 
are protected during this etch by the SiO2 passivation oxide deposited previously). 
Additional experimental characterization of the MIXED method (step two of our 
MMC) is shown in Extended Data Fig. 6. 

Back-end-of-line metal routing. Following the CNT CMOS fabrication, conven- 
tional back-end-of-line metallization is used to define additional metal layers over 
the CNFETs (for example, for power distribution and signal routing). As the metal 
layers below the CNFETs are primarily used for signal routing, we use the top 
(fifth) metal layer in the process for power distribution (Extended Data Fig. 1n). 
Additional metal can be deposited over the input/output pads for wire bonding and 
packaging. At the end of the process, the wafer undergoes a final anneal in forming 
gas at 325°C. The finished wafer is diced into chips, and each chip can be packaged 
for testing or probed for standard cell library characterization. 

This 3D physical architecture (with metal routing below and above the CNFETs) 
is uniquely enabled by the low-temperature processing of the CNFETs. The solu- 
tion-based deposition of the CNTs decouples the high-temperature CNT synthesis 
from the wafer, enabling the entire CNFET to be fabricated with a maximum pro- 
cessing temperature below 325°C. This enables metal layers and the gate stack to 
be fabricated before the CNFET fabrication takes place. This is in contrast to silicon 
CMOS, which requires high-temperature processing (for example, >1,000°C) for 
steps such as doping activation annealing. This prohibits the fabrication of silicon 
CMOS over pre-fabricated metal wires, as the high-temperature silicon CMOS 
processing would damage or destroy these bottom metal layers. 
Experimental measurements. A supply voltage (VDD) of 1.8 V is chosen to maxi- 
mize the noise resilience of the CNT CMOS digital logic, given the experimentally 
measured transfer characteristics of the fabricated CNFETs (noise resilience is quan- 
tified by the SNM metric; see main-text section ‘DREAM’). To interface with each 
RV16X-NANO chip, we use a high-channel-count data acquisition system (120 chan- 
nels) that offers a maximum clock frequency of 10 kHz while simultaneously sam- 
pling all channels. This limits the frequency at which we run RV16X-NANO to 10 kHz, at 
which the power consumption is 969 µW (dominated by leakage current). However, 
this is not the maximum clock speed of RV16X-NANO; during physical design, 
using an experimentally calibrated CNFET compact model and process design kit 
in an industry-practice VLSI design flow, the maximum clock frequency reported 
by Cadence Innovus is 1.19 MHz, following placement-and-routing of all 
logic gates. Future work may improve CNFET-level metrics (for example, improve- 
ments in contact resistance, gate stack engineering, CNT density and CNT alignment 
to increase CNFET on-current) to further speed up clock frequency. 
VLSI design methodology. The design flow of RV16X-NANO leverages only 
industry-standard tools and techniques. We have created a standard process 
design kit for CNFETs as well as a library of standard cells for CNFETs that is 
compatible with existing EDA tools and infrastructure without modification. This 
enables us to leverage decades of existing EDA tools and infrastructure to design, 
implement, analyse and test arbitrary circuits using CNFETs, which is important 
to enable CNFET circuits to be widely adopted in the mainstream. This is the first 
experimental demonstration of a complete process design kit and library for an 
emerging beyond-silicon nanotechnology. 

A high-level description of the RISC-V implementation is written in Bluespec 
and then compiled into a standard RTL hardware description language: Verilog. 
Bluespec enables testing of all instructions (listed in Extended Data Table 1) writ- 
ten in assembly code (for example, using the assembly language commands) to 
verify proper functionality of the RV16X-NANO. The functional tests for each 
instruction are also compiled into waveforms and tested on the RTL generated by 
Bluespec; they are run using Verilator to verify proper functionality of the RTL 
(inputs and outputs are recorded and analysed as value change dump (.vcd) files). 
RTL descriptions of each module are shown in Fig. 2. 

Next is the physical design of RV16X-NANO, including logic synthesis with a 
DREAM-enforcing standard cell library (see Methods section ‘DREAM method 
implementation’), placement and routing, parasitic extraction, and design sign-off 
(that is, design rule check, layout versus schematic, verification of the final Graphic 
Database System, GDSII), as shown in Fig. 4. The RTL is synthesized into digital 
logic gates using Cadence Genus, using the following components of the CNFET 
process design kit and standard cell library: the LIBERTY file (.lib) containing 
power/timing information for all standard library cells, the cell macro library 
exchange format file (.macro.lef) containing abstract views of all standard library 
cells (for example, signal/power pin locations and routing blockage information), 
the technology library exchange format file (.tech.lef) containing metal routing 
layer information (for example, metal/via width/spacing), and the back-end-of-line 
parasitic information (.qrcTech file). To enforce DREAM, we use a subset of library 
cells in the standard cell library, including cells with inverter- and nand2-based 
logic stages (for combinational logic), and logic stages using tri-state inverters (for 
sequential logic), as well as fill cells (to connect power rails) and decap cells (to 
increase capacitance between power rails VDD and VSS); specifically, these 23 cells 
comprise (see Extended Data Fig. 3): and2_x1, buf_x1, buf_x2, buf_x4, buf_x8, 
decap_x3, decap_x4, decap_x5, decap_x6, decap_x8, dff2xdlh_x1, fand2stk_x1, 
inv_x1, inv_x2, inv_x4, inv_x8, inv_x16, mux2nd2_x1, nand2_x1, nor2nd2_x1, 
or2nd2_x1, xnor2nd2_x1 and xor2nd2_x1. During synthesis, all output pads are 
buffered with library cell buf_x8, so that no signal simul- 
taneously drives an output pad as well as another logic stage; this prevents excessive 
capacitive loading in the core. Also, to minimize routing congestion in preparation 
for place-and-route, the register file (containing four registers, as described in 
Fig. 2) is directly synthesized from the Verilog hardware description language 
(instead of being designed ‘by hand’ or using a memory compiler) so that the 
D-flip-flops (dff2xdlh_x1: Extended Data Fig. 3) comprising the state elements 
(registers) can be dispersed throughout the chip to lower the overall total wire 
length. The final netlist is flattened so there is no hierarchy, and so logic can be 
optimized across module boundaries, and is then exported for place and route. 

Placement-and-routing is performed using Cadence Innovus, loading the 
synthesized netlist output from Cadence Genus. The core floorplan for standard 
library cells is defined as 6.912 mm × 6.912 mm. Given the standard cell library 
and logic gate counts from synthesis (and2_x1: 188, buf_x1: 3, buf_x8: 82, buf_x16: 
25, dff2xdlh_x1: 68, fand2stk_x1: 15, inv_x1: 75, inv_x2: 15, inv_x4: 10, inv_x8: 27, 
mux2nd2_x1: 189, nand2_x1: 625, nor2nd2_x1: 27, or2nd2_x1: 211, xnor2nd2_x1: 
14 and xor2nd2_x1: 8), the resulting standard cell placement utilization is 40%. The 
pad ring for input/output is defined as another cell with 160 pads: 40 on each side, 
with minimum width 170 µm and minimum spacing 80 µm, giving a pitch of 250 µm. 
Inputs are primarily towards the top of the chip, outputs are primarily on the bottom, 
and power/ground (VDD/VSS) pads are on the sides (Fig. 1). In addition to the core 
area, an additional boundary of 640 µm is permitted for signal routing around the 
core area (containing all standard library cells), for example, for relatively long global 
routing signals. Placement is performed while optimizing for uniform cell density 
and low routing congestion. The power grid is defined on top of the core area using 
the fifth metal layer (as shown in Fig. 1), while not consuming any additional routing 
resources within the metal layers for signal routing. The clock tree is implemented 
as a single high-fanout net loaded by all 68 D-flip-flops (for each of CLK and the 
inverted clock: CLKN), which is directly connected to an input pad, to minimize 
clock skew variations between registers. All routing signals and vias are defined on 
a grid, with routing jogs enabled on each metal layer to enable optimization target- 
ing maximum spacing between adjacent metal traces. After this stage of routing, 
incremental placement is performed to further optimize congestion, and then filler 
cells and decap cells are inserted to connect the power rails between adjacent library 
cells and to increase capacitance between VDD and VSS to improve signal integrity. 
After this incremental placement, the final routing takes place, reconnecting all 
the signals and routing to the pads, including detailed routing to fix all design rule 
check violations (for example, metal shorts and spacing violations). Finally, parasitic 
resistance and capacitances are extracted to finalize the power/timing analysis, and 
the final netlist is output to quantify the SNM for all pairs of connected logic stages. 
The GDSII is streamed out from Cadence Innovus and is imported into Cadence 
Virtuoso for final design rule check and layout versus schematic, using the stand- 
ard verification rule format files with Mentor Graphics Calibre. The synthesized 
netlist is again used in the RTL functional simulation environment to verify proper 
functionality of all instructions, using Synopsys VCS, with waveforms for each test 
stored in a value change dump (.vcd) file. We note that these waveforms constitute 
the input waveforms to test the final fabricated CNFET RV16X-NANO, as well as 
the expected waveforms output from the core, as shown in Fig. 3. 

Once the GDSII for the core is complete, it is instantiated in a full die, which 
contains the core in the middle, alignment marks and test structures (including 
all standard library cells, CNFETs and test structures to extract wire/via parasitic 
resistance and capacitance) around the outside of the core as shown in Extended 
Data Fig. 2. This die (2 cm × 2 cm) is then tiled onto 150-mm wafers, each of 
which comprises 32 dies (6 × 6 array of dies minus 4 dies in the corners). Each 
layer in the GDS is flattened for the entire wafer and then released for fabrication. 
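
The count of 32 dies per wafer can be checked geometrically. The Python sketch below counts which dies of a 6 × 6 array of 2 cm × 2 cm dies lie entirely inside a 150-mm-diameter circle. It assumes, purely for illustration, that the array is centred on the wafer and ignores edge exclusion zones and the wafer flat; the function name `dies_on_wafer` is hypothetical.

```python
import math

WAFER_DIAMETER_CM = 15.0
DIE_SIDE_CM = 2.0
GRID = 6


def dies_on_wafer(grid=GRID, die=DIE_SIDE_CM, diameter=WAFER_DIAMETER_CM):
    """Return (row, col) indices of dies lying fully within the wafer outline."""
    radius = diameter / 2
    half = grid * die / 2  # half-width of the full array, centred at the origin
    kept = []
    for row in range(grid):
        for col in range(grid):
            x0 = -half + col * die
            y0 = -half + row * die
            # Farthest corner of this die from the wafer centre.
            far_x = max(abs(x0), abs(x0 + die))
            far_y = max(abs(y0), abs(y0 + die))
            if math.hypot(far_x, far_y) <= radius:
                kept.append((row, col))
    return kept


kept = dies_on_wafer()
print(len(kept))  # prints 32
```

Under these assumptions only the four corner dies fall outside the wafer, because their outermost corners sit about 8.5 cm from the centre, beyond the 7.5 cm radius.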
DREAM method implementation. To implement DREAM: 

1) Generate the DREAM SNM table: for each pair of logic stages in the stand- 
ard cell library, quantify the susceptibility of the pair to metallic CNTs as follows: 
use the variation-aware CNFET SNM model (Extended Data Fig. 9) to compute 
SNM for all possible combinations of whether or not each CNFET comprises a 
metallic CNT (for example, in a (nand2, nor2) logic stage pair, there are 256 such 
combinations because there are 8 total CNFETs (2^8 = 256)). Record the minimum 
computed SNM in the DREAM SNM table (Fig. 6b, Extended Data Fig. 9). 

2) Determine prohibited logic stage pairs: choose an SNM cut-off value 
(SNMc), such that all logic stage pairs whose SNM in the DREAM SNM table is 
less than SNMc are prohibited during physical design (see example in Fig. 6b: green 
entries satisfy SNMc whereas red entries indicate prohibited cascaded logic gate pairs). The 
method of choosing SNMc is described below. 

3) Physical design: use industry-practice design flows and EDA tools to imple- 
ment VLSI circuits without using the prohibited logic stage pairs. Ideally, EDA tools 
will enable designers to set which logic stage pairs to prohibit during power/timing/ 
area optimization, but this is currently not a supported feature. To demonstrate 
DREAM in this work, we create a DREAM-enforcing library that comprises a 
subset of library cells such that no possible combination of cells can be connected 
to form a prohibited logic stage pair. 
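
Steps 1 and 2 above can be sketched in a few lines of Python. The code below enumerates every metallic/non-metallic pattern for the CNFETs in a logic stage pair, records the minimum SNM per pair, and applies a cut-off to obtain the prohibited set. It is a minimal sketch with loudly stated assumptions: `toy_model` and all of its numbers are made-up placeholders standing in for the variation-aware CNFET SNM model of Extended Data Fig. 9, and the function names and the CNFET counts per stage are illustrative choices.

```python
from itertools import product

# CNFETs per static CMOS logic stage (assumed: one n- and one p-CNFET per input).
CNFETS_PER_STAGE = {"inv": 2, "nand2": 4, "nor2": 4}


def min_snm(driver, load, snm_model):
    """Minimum SNM over all metallic/non-metallic patterns of the pair's CNFETs."""
    n = CNFETS_PER_STAGE[driver] + CNFETS_PER_STAGE[load]
    patterns = product((False, True), repeat=n)  # 2**n patterns in total
    return min(snm_model(driver, load, p) for p in patterns)


def build_table(stages, snm_model):
    """Step 1: DREAM-style SNM table mapping (driver, load) to its minimum SNM."""
    return {(d, l): min_snm(d, l, snm_model) for d in stages for l in stages}


def prohibited_pairs(table, snm_cutoff):
    """Step 2: every pair whose tabulated SNM falls below the cut-off."""
    return {pair for pair, snm in table.items() if snm < snm_cutoff}


def toy_model(driver, load, pattern):
    """Placeholder only, NOT the model used in this work; values in mV are made up."""
    nominal_mv = 150 if driver == load else 120
    return nominal_mv - 10 * sum(pattern)  # made-up penalty per metallic CNFET


table = build_table(["inv", "nand2", "nor2"], toy_model)
print(prohibited_pairs(table, snm_cutoff=50))
```

With a real SNM model substituted for `toy_model`, the same enumeration covers the 256 patterns of a (nand2, nor2) pair mentioned in step 1.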

To choose SNMc, we use a bisection search. A larger SNMc prohibits more 
logic stage pairs, resulting in better pNMS with higher energy/delay/area cost (and 
vice versa). To satisfy target pNMS constraints (for example, pNMS > 99%), while 
minimizing cost, we optimize SNMc as follows. Step 1: Initialize a lower bound 
L and upper bound U for SNMc. L = 0, and U is the maximum value of SNMc that 
enables EDA tools to synthesize arbitrary logic functions (for example, prohibit- 
ing all logic stage pairs except (inv, inv) would be insufficient). Step 2: Find pNMS 
using SNMc = (L + U)/2, using the design flow in Extended Data Fig. 9. Record 
the set of prohibited logic stage pairs, as well as the circuit physical design, pNMS, 
energy, delay and area. Step 3: If pNMS satisfies the target constraint (for example, 
pNMS = 99%), set U = SNMc. Otherwise set L = SNMc. Step 4: Set SNMc = 
(L + U)/2. If pNMS has already been analysed for the resulting set of prohibited logic 
stage pairs, terminate. Otherwise, return to step 2. 

From all physical designs recorded in step 2, we choose the physical design 
that satisfies the target pNMS constraint with minimum energy/delay/area cost. 
Importantly, the cost of implementing DREAM is <10% energy, <10% delay and 
<20% area. Integrating DREAM within EDA tools, enabling pNMS optimization 
simultaneously with power/timing/area optimization, is a goal for future work on 
improving pNMS versus power/timing/area trade-offs. The effect that the remaining 
metallic CNTs have on EDP is shown in Extended Data Fig. 7. 


Data availability 

The data that support the findings of this study are shown in Figs. 1-6, Extended
Data Figs. 1-9, and Extended Data Table 1, and are available from the corresponding
author on reasonable request.

Extended Data Fig. 1 | Fabrication process flow for RV16X-NANO. The fabrication process is a 5-metal-layer (M1 to M5) process and involves >100
individual process steps. s-CNT, semiconducting CNT; S/D, source/drain.

Extended Data Fig. 3 | CNFET standard cell library. List of all of
the standard cells comprising our standard cell library, along with a
microscopy image of each fabricated standard cell, the schematic of
each cell, and a typical measured waveform from each fabricated cell. As
expected for static CMOS logic stages, the CNFET logic stages exhibit
output voltage swing exceeding 99% of VDD, and achieve gain of >15.
Experimental waveforms are not shown for cells whose functionality is
not demonstrated by output voltage as a function of either input voltage
or time; for example, for cells without outputs (for example, fill cells:
cell names that start with ‘fill’ or decap cells: cell names that start with
‘decap_’), for cells whose output is constant (tied high/low: cell names that
start with ‘tie_’), or for transmission gates (cell names that start with ‘tg_’).


Extended Data Fig. 4 | Image of a completed RV16X-NANO 150-mm wafer. Each wafer includes 32 dies (single die shown in Extended Data Fig. 2). 

Extended Data Fig. 5 | Negligible effect of RINSE on CNTs and CNFETs.
a, CNT density is the same pre- versus post-RINSE. b, CNFET ID-VGS
characteristics exhibit minimal change for sets of CNFETs fabricated with and without
RINSE (VDS = −1.8 V for all measurements shown). Both samples came
from the same wafer, which was diced after the CNT deposition but before
the RINSE process. One sample underwent RINSE while the other sample
did not. c, CNFETs can still be doped NMOS after the RINSE process,
leveraging our MIXED process (VDS = −1.2 V for all measurements
shown).

Extended Data Fig. 6 | MIXED CNFET CMOS characterization.
a, Definitions of key metrics for characterizing logic gates, including
SNM, gain and swing. VOH, VIH, VIL and VOL (labelled on the VTCs in
a, where (VIL, VOH) and (VIH, VOL) are the points on the VTC
where ΔVOUT/ΔVIN = −1) are used to extract the noise margin:
SNM = min(SNMH, SNML). b, Key metrics extracted for the 10,400
CNFET CMOS nor2 logic gates measured in Fig. 5 (metrics defined in a).
This is the largest CNT CMOS demonstration to date, to our knowledge.
VDD is 1.2 V. c, SNM is extracted based on the distributions from b. We
analyse >100 million logic gate pairs based on these experimental results.
d, Spatial dependence of VIH (as an example parameter to compute SNM).
Each pixel represents the VIH of the nor2 at that location in the die.
Importantly, VIH increases across the die (from top to bottom). The change
in VIH corresponds with slight changes in CNFET threshold voltage.


The fact that the threshold voltage variations are not independently
and identically distributed (i.i.d.), but rather have spatial dependence,
illustrates that a portion of the threshold voltage variations (and therefore
variation in SNM) is due to wafer-level processing-related variations (CNT
deposition is more uniform across the 150-mm wafer). Future work should
optimize processing steps, for example, increasing the uniformity of the
atomic-layer-deposition oxide used for electrostatic doping, to
further improve SNM for realizing VLSI circuits. e, Wafer-scale CNFET
CMOS characterization. Measurements from 4 dies across a 150-mm wafer
(1,000 CNFET CMOS nor2 logic gates are sampled randomly from the
10,400 such logic gates in each die). No outliers are excluded. Yield and
performance variations are negligible across the wafer, illustrated by the
distribution of the output voltage swing.

Extended Data Fig. 7 | Effect of metallic CNTs on digital VLSI circuits.
a, Reduction in CNFET EDP benefits versus pS (metallic CNTs increase
IOFF, degrading EDP). pS ≈ 99.999% is sufficient to minimize EDP cost
due to metallic CNTs to <5%. b, pNMS versus pS (metallic CNTs degrade
SNM), shown for SNMR = VDD/5 and for a circuit of one million logic
gates. Although 99.999% pS is sufficient to limit EDP degradation to <5%,
panel b shows that SNM imposes far stricter requirements on purity: pS ≈
99.999999% (that is, number of 9s is 8) to achieve pNMS > 99% (number of
9s is 2). Results in panels a and b are simulated for VLSI circuit modules
from a 7-nm node processor core (see Supplementary Information and
Methods for additional details).
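
As a rough illustration of how a pNMS value for a given circuit size might be obtained, the sketch below computes the fraction of Monte Carlo trials whose minimum SNM meets the requirement and then rescales it by exponentiation, as the caption of Extended Data Fig. 9j describes. The specific form p^(N_target/N_simulated) is an assumption (it treats groups of gates as independent), and the trial values are invented.

```python
def p_nms(min_snm_per_trial, snm_r):
    """Fraction of trials in which the minimum SNM meets the requirement."""
    passed = sum(1 for s in min_snm_per_trial if s >= snm_r)
    return passed / len(min_snm_per_trial)


def rescale(p, n_gates_simulated, n_gates_target):
    """Adjust pNMS to another circuit size by exponentiation (assumed form)."""
    return p ** (n_gates_target / n_gates_simulated)


VDD = 1.2
SNM_R = VDD / 5                            # requirement used in panel b
trials = [0.30, 0.26, 0.22, 0.25, 0.28]    # invented minimum-SNM samples (V)

p = p_nms(trials, SNM_R)                   # 4 of the 5 trials pass
print(p, rescale(p, 1000, 2000))
```

In practice the list of minimum-SNM samples would come from many more trials (10,000 in this work), since a target such as pNMS > 99% cannot be resolved from a handful of samples.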

Extended Data Fig. 8 | Methodology to solve VTCs using CNFET I-V
measurements. a, Experimentally measured ID versus VGS for all 1,000
NMOS (VDS = 1.8 V) and 1,000 PMOS CNFETs (VDS = −1.8 V), with
no CNFETs omitted. Metallic CNTs (m-CNTs) present in some CNFETs
result in high off-state leakage current (IOFF = ID at VGS = 0 V). b, VTC
and SNM parameter definitions, for example, for (nand2, nor2). DR is
the driving logic stage; LD is the loading logic stage. SNM = min(SNMH,
SNML), where SNMH = VOH(DR) − VIH(LD) and SNML = VIL(LD) − VOL(DR).
c-e, Methodology to solve VTCs (for example, for nand2) using
experimentally measured CNFET I-V curves. c, Example ID versus VDS
for NMOS and PMOS CNFETs (VGS is swept from −1.8 V to 1.8 V in
0.1-V increments). d, Schematic. To solve a VTC (for example, VOUT
versus VA with VB = VDD): for each VA, find VX and VOUT such that iPA +
iPB = iNA = iNB (DC, direct current, convergence). e, Current in the pull-up
network (iPU, where iPU = iPA + iPB, and iPA and iPB are the labelled drain
currents of the PMOS FETs gated by A and B, respectively) and current in
the pull-down network (iPD, where iPD = iNA = iNB, and iNA and iNB are the
labelled drain currents of the NMOS FETs gated by A and B, respectively)
versus VOUT and VA. The VTC is seen where these currents intersect.
CNFETs are fabricated at a ~1 µm technology node, and the CNFET width
is 19 µm in panel a.
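
The SNM definition in panel b can be made concrete with a short sketch. It assumes a synthetic tanh-shaped VTC in place of a curve solved from measured I-V data, and it locates the unity-gain points by a simple finite-difference slope test; the function names and the curve parameters are illustrative only.

```python
import math


def make_vtc(vdd=1.2, vm=0.6, k=25.0, step=0.001):
    """Synthetic inverting VTC (tanh-shaped); a stand-in for a solved curve."""
    n = int(round(vdd / step))
    vin = [i * step for i in range(n + 1)]
    vout = [0.5 * vdd * (1.0 - math.tanh(k * (v - vm))) for v in vin]
    return vin, vout


def vtc_params(vin, vout):
    """Return (V_OH, V_IL, V_IH, V_OL) at the points where the slope reaches -1."""
    steep = [i for i in range(len(vin) - 1)
             if (vout[i + 1] - vout[i]) / (vin[i + 1] - vin[i]) <= -1.0]
    if not steep:
        raise ValueError("VTC slope never reaches -1")
    first = steep[0]          # left edge of the high-gain region: (V_IL, V_OH)
    last = steep[-1] + 1      # right edge of the high-gain region: (V_IH, V_OL)
    return vout[first], vin[first], vin[last], vout[last]


def snm(driver_vtc, load_vtc):
    """SNM = min(SNM_H, SNM_L) for a driving stage (DR) and a loading stage (LD)."""
    v_oh, _, _, v_ol = vtc_params(*driver_vtc)   # output levels of the driver
    _, v_il, v_ih, _ = vtc_params(*load_vtc)     # input thresholds of the load
    snm_h = v_oh - v_ih
    snm_l = v_il - v_ol
    return min(snm_h, snm_l)


nominal = make_vtc()
shifted = make_vtc(vm=0.7)    # load whose switching point is shifted upwards
print(snm(nominal, nominal), snm(nominal, shifted))
```

Shifting the load's switching point upwards raises its VIH, so SNMH shrinks and becomes the limiting term; this is the kind of pair-dependent degradation that the SNM analysis of connected logic stage pairs is meant to capture.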

Extended Data Fig. 9 | DREAM implementation and methodology.

a, Standard cell layouts (derived using the ‘asap7sc7p5t’ standard cell 
library*”), illustrating the importance of CNT correlation: because the 
length of CNTs (which can be of the order of hundreds of micrometres) 
is typically much longer compared with the CNFET contacted gate pitch 
(CGP, for example about 42-54 nm for a 7-nm node*’), the number of 
s-CNTs and m-CNTs in CNFETs can be uncorrelated or highly correlated 
depending on the relative physical placement of CNFET active regions*®. 
For many CMOS standard cell libraries at sub-10-nm nodes (for example 
refs *”°), the active regions of FETs are highly aligned, resulting in highly 
correlated number of m-CNTs among CNFETs in library cells, further 
degrading VTCs (because one m-CNT can affect multiple CNFETs 
simultaneously). b-f, Generating a variation-aware CNFET SNM model, 
shown for a D-flip-flop (dff) derived from the asap7sc7p5t standard 

cell library*’. b, Layout used to extract netlists for each logic stage. 

c, Schematic: CNFETs are grouped by logic stage (with nodes arbitrarily
labelled ‘D’, ‘MH’, ‘MS’, ‘SH’, ‘SS’, ‘CLK’, ‘clkn’, ‘clkb’ and ‘QN’ for ease of
reference). d, For each extracted netlist, there can be multiple VTCs:
for each logic stage output, a logic stage input is sensitized if the output
state (0 or 1) depends on the state of that input (given the states of all
the other inputs). For example, for a logic stage with Boolean function
Y = !(A*B+C), C is sensitized when (A, B) = (0, 0), (0, 1) or (1, 0). We
simulate all possible VTCs (over all logic stage outputs and sensitized
inputs), and also in the presence of m-CNTs. For example, panel d shows
a subset of the VTCs for the logic stage in panel b with output node
‘MH’ (labelled in panel c), and sensitized input ‘D’ (with labelled nodes
(‘clkb’, ‘clkn’, ‘MS’) = (0, 1, 0)). The dashed line indicates VTC with no
m-CNTs, and the solid lines are example VTCs in the presence of m-CNTs
(including the effect of CNT correlation). In each case, we model VOH,
VIH, VIL and VOL as affine functions of the number of m-CNTs (Mi) in each
of r regions (M1, ..., Mr), with calibration parameters in the static noise
margin (SNM) model matrix T (shown in panel f). e, Example calibration
of the SNM model matrix T for the VTC parameters extracted in panel d; 
the symbols are VTC parameters extracted from circuit simulations (using 
Cadence Spectre), and solid lines are the calibrated model. f, Affine model 
form. g-j, VLSI design and analysis methodology. g, Industry-practice 
physical design flow to optimize energy and delay of CNFET digital VLSI 
circuits, including: (1) library power/timing characterization (using 
Cadence Liberate) across multiple VDD and using parasitics extracted
from standard cell layouts (derived from the asap7sc7p5t standard cell 
library), in conjunction with a CNFET compact model. (2) Synthesis 
(using Cadence Genus), place-and-route (using Cadence Innovus) with 
back-end-of-line (BEOL) wire parasitics from the ASAP7 process design 
kit (PDK). (3) Circuit EDP optimization: we sweep both VDD and target
clock frequency (during synthesis/place-and-route) to create multiple 
physical designs. The one with best EDP is used to compare design 
options (for example, DREAM versus baseline). h, Subset of logic gates
in an example circuit module, showing the effect of CNT correlation at
the circuit level (for example, the m-CNT counts of CNFETs P3,1 and
P5,1 are both equal to M1 + M2 + M3)*°. i, Distribution of SNM over
all connected logic stage pairs, for a single sample of the circuit m-CNT
counts. The minimum SNM for each trial limits the probability that all
noise margin constraints in the circuit are satisfied (pNMS). j, Cumulative
distribution of minimum SNM over 10,000 Monte Carlo trials, shown for
multiple target pS values, where pS is the probability that a given CNT is a
semiconducting CNT. These results are used to find pNMS versus pS for a
target SNM requirement (SNMR), where pNMS is the fraction of trials that
meet the SNM requirement for all logic stage pairs. We note that pNMS
can then be exponentiated to adjust for various circuit sizes based on the
number of logic gates. k, CNFET compact model parameters (for example,
7-nm node).
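
The notion of a sensitized input used in panel d can be checked with a small enumeration. The sketch below is illustrative only: the helper name and the way the logic stage is passed in as a Python function are assumptions, and it uses the Boolean function Y = !(A*B+C) given as the example in the caption.

```python
from itertools import product


def sensitized_states(func, names, target):
    """List the states of the other inputs for which `target` controls the output."""
    others = [n for n in names if n != target]
    hits = []
    for values in product((0, 1), repeat=len(others)):
        state = dict(zip(others, values))
        low = func(**state, **{target: 0})
        high = func(**state, **{target: 1})
        if low != high:           # output depends on the target input
            hits.append(values)
    return hits


def stage(A, B, C):
    """Example logic stage from the caption: Y = !(A*B + C)."""
    return int(not ((A and B) or C))


print(sensitized_states(stage, ["A", "B", "C"], "C"))  # [(0, 0), (0, 1), (1, 0)]
```

Each returned tuple corresponds to one VTC that would need to be simulated for that input, which is why a single logic stage can contribute several VTCs to the SNM model.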

Extended Data Table 1 | RISC-V instruction set architecture implementation details

inst | category | summary | assembly
addi | register-immediate arithmetic | add constant, no overflow exception | addi rd, rs1, imm
add | register-register arithmetic | addition with 3 GPRs, no overflow exception | add rd, rs1, rs2
andi | register-immediate arithmetic | bitwise AND with constant | andi rd, rs1, imm
and | register-register arithmetic | bitwise AND with 3 GPRs | and rd, rs1, rs2
auipc | register-immediate arithmetic | load (pc + constant) into GPR | auipc rd, imm
beq | conditional branch | branch if 2 GPRs are equal | beq rs1, rs2, imm
bgeu | conditional branch | branch based on unsigned comparison of 2 GPRs | bgeu rs1, rs2, imm
bltu | conditional branch | branch based on unsigned comparison of 2 GPRs | bltu rs1, rs2, imm
bge | conditional branch | branch based on signed comparison of 2 GPRs | bge rs1, rs2, imm
blt | conditional branch | branch based on signed comparison of 2 GPRs | blt rs1, rs2, imm
bne | conditional branch | branch if 2 GPRs are not equal | bne rs1, rs2, imm
jalr | unconditional jump | jump to relative address, place return address in GPR | jalr rd, rs1, imm
jal | unconditional jump | jump to address, place return address in GPR | jal rd, imm
lh | memory instruction | load short from memory into GPR | lh rd, imm(rs1)
lui | register-immediate arithmetic | load upper bits of constant into GPR | lui rd, imm
ori | register-immediate arithmetic | bitwise OR with constant | ori rd, rs1, imm
or | register-register arithmetic | bitwise OR with 3 GPRs | or rd, rs1, rs2
sh | memory instruction | store short into memory | sh rs2, imm(rs1)
slli | register-immediate arithmetic | shift left logical by constant | slli rd, rs1, imm
sll | register-register arithmetic | shift left logical by GPR value | sll rd, rs1, rs2
sltiu | register-immediate arithmetic | set GPR based on unsigned comparison of GPR and constant | sltiu rd, rs1, imm
slti | register-immediate arithmetic | set GPR based on signed comparison of GPR and constant | slti rd, rs1, imm
sltu | register-register arithmetic | set GPR based on unsigned comparison of 2 GPRs | sltu rd, rs1, rs2
slt | register-register arithmetic | set GPR based on signed comparison of 2 GPRs | slt rd, rs1, rs2
srai | register-immediate arithmetic | shift right arithmetic by constant | srai rd, rs1, imm
sra | register-register arithmetic | shift right arithmetic by GPR value | sra rd, rs1, rs2
srli | register-immediate arithmetic | shift right logical by constant | srli rd, rs1, imm
srl | register-register arithmetic | shift right logical by GPR value | srl rd, rs1, rs2
sub | register-register arithmetic | subtraction with 3 GPRs, no overflow exception | sub rd, rs1, rs2
xori | register-immediate arithmetic | bitwise XOR with constant | xori rd, rs1, imm
xor | register-register arithmetic | bitwise XOR with 3 GPRs | xor rd, rs1, rs2


inst | format (type-imm) | instruction fields, listed from bit 31 (left) to bit 0 (right); fields in parentheses are crossed out in the original
addi | I-I | imm[11:0], (rs1[4:2]) rs1, funct3=ADD, (rd[4:2]) rd, opcode=OPIMM
add | R | 0 0 0 0 0 0 0, (rs2[4:2]) rs2, (rs1[4:2]) rs1, funct3=ADD, (rd[4:2]) rd, opcode=OP
andi | I-I | imm[11:0], (rs1[4:2]) rs1, funct3=AND, (rd[4:2]) rd, opcode=OPIMM
and | R | 0 0 0 0 0 0 0, (rs2[4:2]) rs2, (rs1[4:2]) rs1, funct3=AND, (rd[4:2]) rd, opcode=OP
auipc | I-U | (imm[31:16]) imm[15:12], (rd[4:2]) rd, opcode=AUIPC
beq | S-B | imm[10:5], (rs2[4:2]) rs2, (rs1[4:2]) rs1, funct3=BEQ, imm[4:1], opcode=BRANCH
bgeu | S-B | imm[10:5], (rs2[4:2]) rs2, (rs1[4:2]) rs1, funct3=BGEU, imm[4:1], opcode=BRANCH
bltu | S-B | imm[10:5], (rs2[4:2]) rs2, (rs1[4:2]) rs1, funct3=BLTU, imm[4:1], opcode=BRANCH
bge | S-B | imm[10:5], (rs2[4:2]) rs2, (rs1[4:2]) rs1, funct3=BGE, imm[4:1], opcode=BRANCH
blt | S-B | imm[10:5], (rs2[4:2]) rs2, (rs1[4:2]) rs1, funct3=BLT, imm[4:1], opcode=BRANCH
bne | S-B | imm[10:5], (rs2[4:2]) rs2, (rs1[4:2]) rs1, funct3=BNE, imm[4:1], opcode=BRANCH
jalr | I-I | imm[11:0], (rs1[4:2]) rs1, 0 0 0, (rd[4:2]) rd, opcode=JALR
jal | U-J | imm[10:1], (imm[19:16]) imm[15:12], (rd[4:2]) rd, opcode=JAL
lh | I-I | imm[11:0], (rs1[4:2]) rs1, funct3=LH, (rd[4:2]) rd, opcode=LOAD
lui | I-U | (imm[31:16]) imm[15:12], (rd[4:2]) rd, opcode=LUI
ori | I-I | imm[11:0], (rs1[4:2]) rs1, funct3=OR, (rd[4:2]) rd, opcode=OPIMM
or | R | 0 0 0 0 0 0 0, (rs2[4:2]) rs2, (rs1[4:2]) rs1, funct3=OR, (rd[4:2]) rd, opcode=OP
sh | S-S | imm[11:5], (rs2[4:2]) rs2, (rs1[4:2]) rs1, funct3=SH, imm[4:0], opcode=STORE
slli | I-I | 0 0 0 0 0 0 0, imm[3:0], (rs1[4:2]) rs1, funct3=SLL, (rd[4:2]) rd, opcode=OPIMM
sll | R | 0 0 0 0 0 0 0, (rs2[4:2]) rs2, (rs1[4:2]) rs1, funct3=SLL, (rd[4:2]) rd, opcode=OP
sltiu | I-I | imm[11:0], (rs1[4:2]) rs1, funct3=SLTU, (rd[4:2]) rd, opcode=OPIMM
slti | I-I | imm[11:0], (rs1[4:2]) rs1, funct3=SLT, (rd[4:2]) rd, opcode=OPIMM
sltu | R | 0 0 0 0 0 0 0, (rs2[4:2]) rs2, (rs1[4:2]) rs1, funct3=SLTU, (rd[4:2]) rd, opcode=OP
slt | R | 0 0 0 0 0 0 0, (rs2[4:2]) rs2, (rs1[4:2]) rs1, funct3=SLT, (rd[4:2]) rd, opcode=OP
srai | I-I | 0 1 0 0 0 0 0, imm[3:0], (rs1[4:2]) rs1, funct3=SR, (rd[4:2]) rd, opcode=OPIMM
sra | R | 0 1 0 0 0 0 0, (rs2[4:2]) rs2, (rs1[4:2]) rs1, funct3=SR, (rd[4:2]) rd, opcode=OP
srli | I-I | 0 0 0 0 0 0 0, imm[3:0], (rs1[4:2]) rs1, funct3=SR, (rd[4:2]) rd, opcode=OPIMM
srl | R | 0 0 0 0 0 0 0, (rs2[4:2]) rs2, (rs1[4:2]) rs1, funct3=SR, (rd[4:2]) rd, opcode=OP
sub | R | 0 1 0 0 0 0 0, (rs2[4:2]) rs2, (rs1[4:2]) rs1, funct3=ADD, (rd[4:2]) rd, opcode=OP
xori | I-I | imm[11:0], (rs1[4:2]) rs1, funct3=XOR, (rd[4:2]) rd, opcode=OPIMM
xor | R | 0 0 0 0 0 0 0, (rs2[4:2]) rs2, (rs1[4:2]) rs1, funct3=XOR, (rd[4:2]) rd, opcode=OP

The top panel shows all supported instructions implemented in RV16X-NANO, adhering to RISC-V format specifications for RV32E, with high-level description summary for each. Each instruction is
categorized into one of six formats, including instruction type (R-type, I-type, S-type, U-type) and immediate variant (I-immediate, U-immediate, B-immediate, J-immediate, S-immediate), forming one
of six formats (type-immediate): R, I-I, I-U, S-B, S-S, U-J (shown in the bottom panel). For the assembly code, ‘rd’ is the destination register, ‘rs1’ is the source register 1, ‘rs2’ is the source register 2,
‘imm’ is immediate. The bottom panel shows the bit-level description of each instruction format. The bottom 7 bits (inst[6:0]) are always the OPCODE, and then the remaining bits are decoded
depending on the instruction format (determined by the OPCODE). Values that are crossed out indicate bits that are not used for the 16-bit data path implementation (RV16E) with four registers,
instead of 32-bit data path implementation (RV32E) with 16 registers. For example, for instruction ‘auipc’, only 2 of the 5 reserved bits for ‘rd’ are required to address the register file for register ‘rd’
(because there are only 2^2 = 4 registers instead of 2^5 = 32), and also the upper 16 bits of the 32-bit immediate (that is, imm[31:16]) are not used because the data path is truncated to 16 bits.
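
To make the decoding rule concrete, the following minimal sketch splits an R-type instruction word into its fields and keeps only the low 2 bits of each register field, on the assumption that this is how the unused upper register bits would be ignored with four registers. The field positions are the standard RISC-V R-type layout; the function names are hypothetical and this is not the RV16X-NANO decoder itself.

```python
OPCODE_OP = 0b0110011  # standard RISC-V opcode for register-register ALU operations


def decode_r_type(inst, reg_bits=2):
    """Split a 32-bit R-type word into fields, truncating register indices."""
    reg_mask = (1 << reg_bits) - 1      # 0b11 when there are four registers
    return {
        "opcode": inst & 0x7F,          # inst[6:0]
        "rd": (inst >> 7) & reg_mask,   # low bits of inst[11:7]
        "funct3": (inst >> 12) & 0x7,   # inst[14:12]
        "rs1": (inst >> 15) & reg_mask, # low bits of inst[19:15]
        "rs2": (inst >> 20) & reg_mask, # low bits of inst[24:20]
        "funct7": (inst >> 25) & 0x7F,  # inst[31:25]
    }


def encode_r_type(funct7, rs2, rs1, funct3, rd, opcode=OPCODE_OP):
    """Assemble an R-type word from its fields (helper for the example)."""
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


word = encode_r_type(funct7=0, rs2=3, rs1=2, funct3=0, rd=1)  # add x1, x2, x3
print(hex(word))  # 0x3100b3
print(decode_r_type(word))
```

With `reg_bits=2`, any value placed in the upper three bits of a register field has no effect on the decoded index, which corresponds to the crossed-out register bits in the bottom panel.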

Metastatic-niche labelling reveals
parenchymal cells with stem features


Luigi Ombrato!, Emma Nolan!, Ivana Kurelac!?, Antranik Mavousian?, Victoria Louise Bridgeman!, Ivonne Heinze’, 
Probir Chakravarty°, Stuart Horswell®, Estela Gonzalez-Gualda!, Giulia Matacchione!, Anne Weston®, Joanna Kirkpatrick’, 
Ehab Husain’, Valerie Speirs®, Lucy Collinson®, Alessandro Ori*, Joo-Hyeon Lee*** & Ilaria Malanchi!* 


Direct investigation of the early cellular changes induced by metastatic cells within the surrounding tissue remains a 
challenge. Here we present a system in which metastatic cancer cells release a cell-penetrating fluorescent protein, which
is taken up by neighbouring cells and enables spatial identification of the local metastatic cellular environment. Using this 
system, tissue cells with low representation in the metastatic niche can be identified and characterized within the bulk 
tissue. To highlight its potential, we applied this strategy to study the cellular environment of metastatic breast cancer 
cells in the lung. We report the presence of cancer-associated parenchymal cells, which exhibit stem-cell-like features, 
expression of lung progenitor markers, multi-lineage differentiation potential and self-renewal activity. In ex vivo assays, 
lung epithelial cells acquire a cancer-associated parenchymal-cell-like phenotype when co-cultured with cancer cells 
and support their growth. These results highlight the potential of this method as a platform for new discoveries. 


Cancer cell behaviour is strongly influenced by the surrounding cells in 
the tumour microenvironment (TME). Various cell types in the TME 
are known to influence cancer cell behaviour, including mesenchymal 
cells such as activated fibroblasts, pericytes and endothelial cells, as well 
as different types of inflammatory cells’. 

During the early phase of metastatic growth, cancer cells generate 
a local TME (metastatic niche), which is distinct from the normal tis- 
sue structure and key for supporting metastatic outgrowth’. However, 
detailed analysis of the cellular composition of the metastatic niche, 
especially at early stages, is constrained by the difficulty of spatially 
discriminating the metastatic-niche cells within the bulk tissue. This 
hampers the identification of cells that might respond to early coloni- 
zation by cancer cells but remain low in number as metastases grow. 

In this study, we present a strategy in which metastatic cancer cells 
mark their neighbouring cells, thereby identifying them in the tissue 
and overcoming these limitations. We have applied this system to 
interrogate the early metastatic environment of breast cancer cells in 
the lung. We confirm that the system enables us to quantitatively and 
qualitatively distinguish known metastatic-niche cells within the tissue, 
and identify lung epithelial cells, in which a regenerative-like program 
is activated, as a component of the metastatic TME. We show that these 
epithelial cells acquire multi-lineage differentiation potential when 
co-cultured with cancer cells and support their growth. These results 
support the notion that, in addition to the well-characterized stromal 
activation, a parenchymal response might contribute to creating the 
metastatic microenvironment. 


The mCherry niche-labelling system 

To develop a labelling system that uses metastatic cancer cells to directly 
identify their neighbouring cells in vivo, we generated a secreted flu- 
orescent mCherry protein containing a modified lipid-permeable 
transactivator of transcription (TATk) peptide** (sLP-mCherry) 
(Fig. 1a and Extended Data Fig. 1a). We engineered 4T1 breast cancer
cells to co-express the sLP-mCherry and GFP; we refer to these


cells as labelling-4T1 cells. In vitro, sLP-mCherry protein secreted by 
labelling-4T1 cells re-enters the cells, as indicated by changes in the intra- 
cellular localization of the red fluorescence (Extended Data Fig. 1b, c). 
sLP-mCherry protein is also taken up by unlabelled cells, both in 
co-culture with labelling-4T1 cells (Fig. 1b-d) and when cultured in
medium conditioned by labelling-4T1 cells (LCM) (Extended Data 
Fig. 1d, e). Upon uptake into a cell, sLP-mCherry fluorescence has an 
intracellular half-life of 43 h (Extended Data Fig. 1f) and is localized 
in CD63+ multi-lamellar bodies (lysosomal-like structures) where,
owing to its high photostability°, it retains high fluorescence intensity 
(Extended Data Fig. 1g, h). Fractionation of LCM shows that only the 
soluble fraction retains labelling activity, whereas the extracellular vesi- 
cles, a proportion of which contain sLP-mCherry, do not show labelling 
activity in vitro (Extended Data Fig. 1i-k). 

In vivo, intravenous injection of labelling-4T1 cells (GFP+mCherry+)
into syngeneic BALB/c mice to induce lung metastases efficiently labels
surrounding host tissue cells (GFP−mCherry+), penetrating approximately
five cell layers (Fig. 1e-g and Extended Data Fig. 2a, b). This
enables specific discrimination of host cells in close proximity to cancer
cells from distal lung cells (GFP−mCherry−) using fluorescence-activated
cell sorting (FACS) (Fig. 1f). Notably, when micro-metastases
grow larger, the number of mCherry+-niche cells in the tissue remains
proportional to the number of metastatic cells (Extended Data Fig. 2c).
We detected no adaptive immunogenicity against sLP-mCherry and
the local increase of CD45+ immune cells within the mCherry population
was observed specifically as a response to cancer cells (Extended
Data Fig. 2d-f). Thus, this mCherry-niche-marking strategy enables
spatial reconstitution of the local metastatic niche within the tissue.
This permits functional identification of labelled cells and direct
comparison with unlabelled cells within the same lung.


Tissue spatial resolution 
To demonstrate the utility of the mCherry-niche strategy to specif- 
ically interrogate the local early changes induced by cancer cells, we 
seeded 4T1-labelling cells in the lung via tail-vein injection. Lung
tissue distant from micro-metastases remained unperturbed by primary- 
tumour-derived systemic changes’. To validate the mCherry-niche 
strategy, we first examined components known to be involved in 
metastatic-niche formation. CD45+ immune cells were very abundant
in the mCherry+ niche and nearly exclusively derived from the myeloid
lineage (CD11b+) (Extended Data Figs. 2d, 3a). Lung neutrophils have
been reported to enhance metastatic growth of cancer cells®, and 
were indeed detected in the mCherry+ niche (Extended Data Fig. 3b).
Because abnormalities in lung neutrophils are often associated with 
cancer!°, we isolated mCherry+-niche neutrophils (Ly6G+) and compared
their proteome to that of unlabelled neutrophils from the same
lungs (Fig. 2a). The sub-pool of mCherry+-niche neutrophils exhibited
an increase in translation, oxidative phosphorylation and intracellular 
reactive oxygen species (ROS) levels relative to unlabelled neutrophils, 
as determined by FACS analysis (Fig. 2b, Extended Data Fig. 3c-f and 
Supplementary Data). To validate the functional relevance of specific 
features identified in mCherry+-niche cells, we developed a 3D-scaffold
co-culture system that mimics complex tissue-like cell-cell interactions.
We found that lung neutrophils increased growth of actin-GFP+ mouse
mammary tumour virus (MMTV)-polyoma virus middle T antigen
(PyMT) breast cancer cells in a ROS-dependent manner (Fig. 2c-e and
Extended Data Fig. 3g, h). Collectively, these data highlight the poten- 
tial of our strategy to detect in vivo changes that are spatially restricted 
to the metastatic environment. 


The non-immune mCherry+-niche signature

Whereas the contribution of immune cells to metastatic outgrowth has 
been widely investigated!’, less is known about the role of other TME 
cell types during metastatic nesting. Notably, the mCherry-labelling 
strategy can be used to provide spatiotemporal information by apply- 
ing it to different stages of metastatic progression. We generated the 
gene-expression profile of non-immune (CD45−) mCherry+-niche
cells at the time point immediately preceding micro-metastases as 
well as at an advanced metastatic stage (Fig. 3a, b). The majority of 
alterations were detected at the early stage, but additional changes sub- 
sequently discriminated the niche of macro-metastases (Fig. 3c and 
Extended Data Fig. 4a, b), confirming the evolution of the metastatic 
TME over time. MetaCore dataset enrichment and gene-set enrich- 
ment analysis (GSEA) highlighted changes in pathways related to 
proliferation, inflammation and tissue remodelling (Extended Data 
Fig. 4b, c). We next focused on the upregulated (more than twofold) 
genes encoding soluble factors in the mCherry~ niche at both time 

Fig. 1 | The mCherry-niche labelling strategy.
a, Label design. Labelling-4T1 cells co-express
the lipid-soluble cell-penetrating mCherry-fusion
protein label and GFP. b, c, Representative
FACS plots of naive 4T1 cells cultured alone (b)
or co-cultured with labelling-4T1 cells (c).
Numbers indicate the percentage of cells in the
respective quadrant. d, Fluorescence image from
co-cultures (scale bar, 10 µm). Data representative
of two independent experiments (b-d).
e-g, In vivo labelling. e, Schematic of the
experiment®: labelling-4T1 cells are injected
into mice; these cells metastasize in the lung
and label nearby cells in the TME (niche) with
mCherry. f, Representative FACS plot of a
metastatic lung, n = 50 mice. g, Representative
immunofluorescence images of labelling-4T1
cell metastasis (n = 8 mice). Labelling-4T1 cells
are positive for both GFP and mCherry, whereas
metastatic niche cells are positive for mCherry
only. Blue, DAPI. Scale bars: main panels, 20 µm;
enlarged insets, 10 µm. For gating strategy
see Supplementary Information.

Fig. 2 | The mCherry-niche strategy enables characterization of
metastatic-niche neutrophils. a, b, Proteomic analysis of FACS-sorted
Ly6G+ cells: all differentially detected proteins (a) and proteins associated
with oxidative phosphorylation (b). c-e, Three-dimensional co-culture,
with or without the ROS inhibitor TEMPO, of GFP+ MMTV-PyMT
cancer cells and Ly6G+ cells sorted by magnetic-activated cell sorting
(MACS). c, The co-culture scheme. d, Quantification of GFP signal
(n = 3 independent experiments, each with 3 to 10 technical replicates).
Data are normalized to cancer cell growth and represented as mean ± s.e.m.
Statistical analysis of biological replicates by two-way ANOVA.
e, Representative images from three independent experiments (day 6;
scale bar, 400 µm).

Fig. 3 | The mCherry-niche strategy identifies an epithelial component
of metastatic TME. a, Schematic of metastatic progression using
labelling-4T1 cells®. b, Experimental design for RNA-seq experiments°.
c, Principal component analysis (PCA) of CD45−Ter119− cell signatures
from metastatic lungs at early (n = 3, 10 mice each) and late (n = 3, 5 mice
each) time points. The black oval encloses the distal lung samples; red
ovals enclose the mCherry+-niche samples to highlight their similarity
in the PCA plot. d, Venn diagram of differentially expressed genes in
the mCherry+ niche and selected factors that are common at early and
late stages. Wisp1 is also known as Ccn4. e, WISP1-blocking antibody
treatment in vivo (n = 10, 2 independent experiments). Box edges
represent 25th and 75th percentiles, the horizontal bar is the median and
the whiskers show the range of values. f, GSEA correlation from RNA-seq


points (Fig. 3d and Supplementary Data). We found many previously 
reported tumour-promoting factors'?"””, further validating the ability 
of our labelling system to faithfully capture the in vivo metastatic niche. 
We also found WNT1-induced protein (WISP1), which has been suggested
to act as an oncogene in breast cancer*”, to be widely expressed
in the mCherry+ niche (Fig. 3d). Indeed, we detected upregulation of
WISP1 in both cancer and metastatic-niche cells and confirmed its 
pro-metastatic activity by exogenous inhibition in vivo (Fig. 3e and 
Extended Data Fig. 5a-e). 

We next probed the TME for other non-immune cell types, which 
might be difficult to resolve by standard techniques owing to their small 
numbers. Of note, we found pathways associated with lung epithelial 
cells in the metastatic-niche signature (Fig. 3f). Micro-metastases grow 
embedded within the alveolar compartment of the lung, and we found 
alveolar type II cells (AT2) expressing surfactant protein C (SP-C, 
encoded by Sftpc) in the metastatic niche (Fig. 3g). We also found 
mCherry+-niche cells expressing the epithelial cell adhesion marker
EPCAM, further supporting the presence of cells of lung parenchymal 
origin (Fig. 3h, i). 


Cancer-associated parenchymal cells 

We found mCherry+-niche epithelial cells to have a higher proliferative
activity compared to their normal lung counterparts (Fig. 4a).
Concordantly, we detected alveolar cell clusters with increased pro- 
liferative activity at the metastatic borders of human breast cancer 
lung metastases, suggesting that a lung parenchymal response to 
metastatic growth may occur in both mouse and human (Extended 
data comparing early (n = 3) or late (n = 3) mCherry+ samples with their
respective mCherry− controls. g, Left, representative immunofluorescence
images of lung tissue (n = 3 mice) showing mCherry-labelled micro-metastasis
(red), SP-C (white) and DAPI (blue, middle). Right, enlarged
view of areas indicated with dashed outlines. Scale bars: main panel,
100 µm; enlarged insets, 10 µm (white arrows and dashed outlines,
mCherry-labelled SP-C+ cells). h, EPCAM+ cell frequency among Lin−
(CD45−CD31−Ter119−) cells in distal lung (mCherry−) and mCherry+
cells estimated by FACS (n = 13 mice). i, Representative FACS plots from h.
Numbers indicate the percentage of cells in the respective quadrant.
Statistical analysis by unpaired two-tailed t-test with Welch’s correction (e),
weighted Kolmogorov-Smirnov-like statistic with Benjamini-Hochberg
correction (f) and paired two-tailed t-test (h).


Data Fig. 6a-f). Cancer cells benefit from the presence of a lung parenchymal
response, as freshly isolated EPCAM⁺ cells from naive lungs
supported the growth of actin-GFP⁺ MMTV-PyMT tumour cells in
our 3D-scaffold co-culture system (Fig. 4b-d). Moreover, in line with
the results shown in Fig. 2c-e, the presence of both lung neutrophils
and epithelial cells further enhanced tumour growth (Extended Data
Fig. 7a-d), highlighting the cellular complexity of the metastatic niche.

We next aimed to better define the perturbation occurring in lung
epithelial cells in the proximity of cancer cells. To contextualize their
presence among the other cellular components of the metastatic niche,
we performed single-cell RNA sequencing (scRNA-seq) of CD45⁻ cells.
t-Distributed stochastic neighbour embedding (t-SNE) analysis of
mCherry⁺-niche cells identified a large stromal cluster in which different
stromal cells could be distinguished (Fig. 4e and Extended Data
Fig. 8a-c). This is in agreement with the various known mesenchymal
cell components of the TME, as well as the characterization of different
fibroblast subsets. Notably, specifically in the mCherry⁺ niche,
Epcam-expressing epithelial cells are distributed in two clusters distinguished
by the expression of E-cadherin (Cdh1) (Fig. 4e). We found
that only mCherry⁺-niche Epcam⁺Cdh1⁺ cells shared the expression
of alveolar genes with unlabelled distant lung Epcam⁺ cells (Fig. 4f, g).
Conversely, mCherry⁺-niche Epcam⁺Cdh1⁻ cells expressed both the
progenitor markers SCA1 (encoded by Ly6a) and Tm4sf1 (Fig. 4g).
As validation of this de-differentiated signature observed in the majority
of epithelial cells in the metastatic niche, reverse transcription with
quantitative PCR (RT-qPCR) of EPCAM-sorted mCherry⁺-niche
cells also showed an overall reduction in expression of alveolar lineage
Fig. 4 | Lung epithelial cells in the metastatic niche display a progenitor
phenotype. a, Ki67 staining in FACS-sorted mCherry⁻ and mCherry⁺
EPCAM⁺ cells (n = 7 independent sorts). b-d, GFP⁺ MMTV-PyMT
cancer cell growth in 3D co-culture with MACS-sorted EPCAM⁺ cells.
b, The co-culture scheme. c, Representative images from 4 independent
experiments (day 6; scale bar, 400 μm). d, Quantification of GFP
signal (n = 4, each with 3 technical replicates, statistical analysis of
biological replicates). Data are normalized to cancer cell growth.
e-g, scRNA-seq analysis; t-SNE plots of CD45⁻ cells from the mCherry⁺
niche (e; n = 1,473) or distal lung (f; n = 1,996). g, Right, heat map of


markers (Fig. 4h). Moreover, the enrichment of EPCAM⁺SCA1⁺ cells
in the mCherry⁺ niche of different metastatic cell types was confirmed
by FACS analysis (Fig. 4i and Extended Data Fig. 9a-c). Similarly, the
presence of epithelial cells expressing another lung progenitor marker,
integrin β4 (also known as CD104), was increased in the mCherry⁺
niche and in ex vivo co-cultures (Extended Data Fig. 9d-i).

In summary, we describe a parenchymal response to lung metastasis 
involving de-differentiated pools of epithelial cells in the niche, which 
we define as cancer-associated parenchymal cells (CAPs). 


CAPs are activated AT2 cells 
To functionally characterize CAPs, we tested their lineage differentiation
potential ex vivo using a 3D Matrigel-based organoid co-culture system
(Fig. 5a). Unlabelled resident lung EPCAM⁺ cells are predominantly
alveolar, and formed mainly alveolar organoids when co-cultured
with CD31⁺ cells (Fig. 5b-d). mCherry⁺-niche EPCAM⁺ cells favoured
the bronchiolar lineage and showed a remarkable capacity to generate
multi-lineage bronchioalveolar organoids (Fig. 5d). Despite the bias in
organoid formation towards the bronchial lineage, we did not detect 
mCherry-labelled cells expressing bronchial markers in vivo (Extended 
Data Fig. 10a). CAPs also retained high self-renewal capacity over 
multiple passages (Fig. 5e). 

Next, we tested whether tumour cells could directly induce the 
CAP phenotype. When EPCAM⁺ cells from unlabelled distal micro-metastatic
lungs or naive lungs were co-cultured with metastatic cells,
mCherry⁺-niche EPCAM⁺ cells (ordered genes in rows and hierarchically
clustered cells in columns); left, table shows established lineage markers
(bold); asterisks indicate putative alveolar markers. h, RT-qPCR analysis
of EPCAM⁺ FACS-sorted cells (Sftpc, Aqp5, n = 9; Sftpb, Abca3, Pdpn,
Ager, Vim, Cdh1, n = 8; Krt6, Cdh2, n = 7; Snail, n = 4; Twist, n = 3).
Data represented as fold change relative to mCherry⁻ lung EPCAM⁺ cells
(statistical analysis on the ΔCt values). i, EPCAM⁺SCA1⁺ cell frequency
among Lin⁻ (CD45⁻CD31⁻Ter119⁻) cells, determined by FACS (n = 13
mice). Statistical analysis by paired two-tailed t-test (a, h, i), one-sample
two-tailed t-test (d). Data represented as mean ± s.e.m.


they generated a higher proportion of bronchiolar and bronchioalveolar 
organoids (Fig. 5f-h and Extended Data Fig. 10b, c). Similar alterations
were induced by cancer cells when the assay was performed using
mouse lung fibroblasts (MLg cells) instead of CD31⁺ cells (Extended
Data Fig. 10b, c). 

Although lung EPCAM⁺ cells are predominantly alveolar, they
also contain epithelial progenitors that could be enriched by cancer
cells to generate increased plasticity. Therefore, we performed
organoid cultures using lineage-labelled AT2 (Sftpc-lineage) cells.
Sftpc-lineage cells, which show no plasticity in co-culture with
CD31⁺ cells, generated multi-lineage bronchioalveolar organoids
when exposed to cancer cells, supporting the idea of a reprogramming
activity driven by cancer-cell-derived factors ex vivo (Fig. 5i, j).
Despite the potential of cancer cells to modulate the organoid formation
ability of lineage-labelled club cells (Scgb1a1 lineage), only rare
single Scgb1a1-lineage cells were found in proximity to lung metastases
(Extended Data Fig. 10d-f). Conversely, metastases growing
in Sftpc-lineage lungs demonstrated the alveolar (AT2) origin of the 
CAPs (Fig. 5k). 

Recently, a rare population of AT2 cells expressing Axin2, with stem
cell and repair activity (AT2 stem cells), was described in the lung
alveoli. Whereas a small proportion of Axin2-expressing cells was
found in the unlabelled epithelial cluster, Axin2 was undetectable in
the mCherry⁺-niche EPCAM clusters (data not shown). Therefore,
even if cancer cell seeding could trigger lung injury, this phenomenon 
Fig. 5 | CAPs show multi-lineage differentiation potential. a-e, Lung
organoids: co-culture scheme (a); representative bright-field images
(b; scale bar, 100 μm); representative immunofluorescence of organoid
sections stained with the indicated markers (c; scale bar, 50 μm);
quantification (d) and organoid formation efficiency after passaging (e).
Ac-tubulin, acetyl tubulin. f-h, Lung organoid cultures with or without
labelling-4T1 cells: co-culture scheme (f); representative bright-field
images (g; scale bar, 100 μm) and quantification (h). i, j, Lung organoids
with Sftpc-CreERT2 lineage cells with or without non-labelling 4T1-GFP


does not appear to specifically maintain an Axin2⁺ AT2 population in
the metastatic niche. 

Collectively, these data demonstrate the alveolar origin of CAPs and 
the ability of cancer cells to induce multi-lineage differentiation poten- 
tial of epithelial cells ex vivo. 


Discussion 

This study introduces the mCherry-niche labelling system and 
demonstrates its ability to resolve the host tissue cellular environment 
in regions surrounding cancer cells. We report the presence of a lung 
epithelial compartment within the metastatic niche, which originates 
from AT2 cells. We define this TME component as CAPs and describe 
their activated regenerative state by showing their de-differentiated 
signature, tissue stem-cell-like features, multi-lineage differentiation 
potential and increased self-renewal activity. 

Parenchymal cells have been described as triggering a tissue-wide 
pro-tumorigenic inflammatory response to systemic primary tumour 
signals. In addition to these systemic effects, we here show that a
regenerative-like activation in the lung parenchyma occurs as a direct 




cells: quantification (i) and representative bright-field images (j; scale bar,
150 μm). Images are representative of six (b, c, g) or three (j) organoid
cultures. Data generated from independent sorts (d, h, i) and represented
as cumulative percentage using the mean ± s.d. of three co-cultures per
sorting. k, Representative staining of lineage cells in metastatic lungs
from Sftpc-CreERT2 mice injected with E0771 (n = 3; scale bar, 50 μm) or
MMTV-PyMT (n = 3; scale bar, 100 μm) cancer cells. Statistical analysis
by unpaired two-tailed t-test (d, e, h) and one-sample two-tailed t-test (i)
on original non-cumulative values.


local response during breast cancer metastasis. This parenchymal 
response, combined with the stromal activation, is potentially a key 
orchestrator of tumour-niche formation. 

Together these results consolidate the mCherry-niche system as a 
platform for discoveries with the potential to identify, isolate and func- 
tionally test cells from the metastatic niche with high spatial resolution. 
METHODS 


Sample sizes were estimated based on previous experiments conducted in our 
laboratory, providing sufficient numbers of mice in each group to yield a two- 
sided statistical test, with the potential to reject the null hypothesis with a power 
(1 − β) of 80%, subject to α = 0.05. No further statistical methods were used
to predetermine sample size. Most experiments were not randomized: only the 
experiment involving treatment was randomized. Whenever possible, investigators 
were blinded to allocation during outcome assessment. 
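
As a rough illustration of the power calculation described above, the sketch below uses the normal approximation for a two-sided, two-sample comparison with a power of 80% and α = 0.05. The standardized effect size `effect_size_d` is a hypothetical input: the effect sizes taken from previous experiments are not reported here, and the function name is ours.

```python
import math
from statistics import NormalDist


def mice_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate number of animals per group (normal approximation).

    n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2, rounded up.
    effect_size_d is an assumed standardized difference (Cohen's d).
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = standard_normal.inv_cdf(power)           # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size_d) ** 2)


# Hypothetical effect sizes, for illustration only.
for d in (1.0, 1.5, 2.0):
    print(d, mice_per_group(d))
```

An exact calculation based on the t distribution would give slightly larger group sizes for small n, so this should be read as a lower-bound estimate.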

Statistical analysis. Statistical analyses were performed using Prism v.7.0c 
(GraphPad Software). P values were obtained from two-tailed Student's t-tests 
with paired or unpaired adjustment. When needed, unpaired t-tests were adjusted 
using Welch's correction for unequal variance. In one instance (Fig. 4i), data in one 
of the groups did not pass the D'Agostino and Pearson normality test, therefore 
a Wilcoxon matched-pairs signed-rank test was performed. Single-sample tests 
were also used for comparisons of co-cultured cancer cell growth on scaffolds to 
the normalized value of cancer cells alone. For comparisons between two scaffold 
conditions of growth over time, or to perform multiple analyses between experimental
groups, two-way ANOVA was used.
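
To make the Welch adjustment mentioned above concrete, the following minimal sketch computes the unequal-variance t statistic and the Welch-Satterthwaite degrees of freedom for two independent groups. The analyses in this study were run in Prism; the function name and the example values are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t, df) for Welch's unequal-variance t-test."""
    # Squared standard error contributed by each group (sample variance / n).
    se2_a = variance(group_a) / len(group_a)
    se2_b = variance(group_b) / len(group_b)
    t = (mean(group_a) - mean(group_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(group_a) - 1) + se2_b ** 2 / (len(group_b) - 1)
    )
    return t, df


# Hypothetical measurements, for illustration only.
t, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(round(t, 3), round(df, 2))
```

The two-tailed P value then follows from the t distribution with `df` degrees of freedom; in Python this is available, for example, through `scipy.stats.ttest_ind` with `equal_var=False`.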

Mouse strains. All mice used are available from the Jackson Laboratory. MMTV-PyMT
mice are on a FVB and C57BL/6 background, actin-GFP mice and Rag1
KO mice are on the FVB background (gift from J. Huelsken laboratory (EPFL,
Lausanne, Switzerland)). Sftpc-CreERT2, Rosa26R-YFP (Sftpc-CreERT2;R26R-YFP)
mice are on a C57BL/6 background. BALB/cJ mice and the above-mentioned
lines were bred and maintained under specific-pathogen-free conditions by The 
Francis Crick Biological Research Facility and female mice were used between 
6 and 10 weeks of age. Breeding and all animal procedures were performed at the 
Francis Crick Institute in accordance with UK Home Office regulations under 
project license P83B37B3C. 

For ex vivo organoid lineage-tracing experiments, Scgb1a1-CreERT2 and
Rosa26R-fGFP, Sftpc-CreERT2 (Sftpc-CreERT2;R26R-fGFP and
Scgb1a1-CreERT2;R26R-fGFP) mice on a C57BL/6 background were bred and maintained
under specific-pathogen-free conditions at the Gurdon Institute of the University 
of Cambridge in accordance with UK Home Office project licence PC7F8AE82. All 
animal work was conducted under UK Home Office regulations, project licenses 
P83B37B3C and PC7F8AE82. 

Tamoxifen administration. Tamoxifen (Merck Sigma-Aldrich) was dissolved in 
Mazola corn oil (Merck Sigma-Aldrich) in a 20 mg ml⁻¹ stock solution. Two doses
of tamoxifen (0.2 mg per g body weight) were given via oral gavage every other 
day and lung tissues were collected two days after tamoxifen administration to 
isolate cells for lung organoids. For in vivo lineage tracing three doses of tamoxifen 
(0.2 mg per g body weight) were given via oral gavage over consecutive days and 
mice were injected two weeks later. 
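
The dosing arithmetic implied by the stock concentration and the per-weight dose can be sketched as follows. The 20 g body weight is a hypothetical example and the function name is ours.

```python
def gavage_volume_ml(body_weight_g, dose_mg_per_g=0.2, stock_mg_per_ml=20.0):
    """Volume of tamoxifen stock to deliver one dose by oral gavage."""
    dose_mg = body_weight_g * dose_mg_per_g  # 0.2 mg per g body weight
    return dose_mg / stock_mg_per_ml         # stock is 20 mg per ml


# Hypothetical 20 g mouse: 4 mg of tamoxifen, delivered in 0.2 ml of stock.
print(gavage_volume_ml(20.0))
```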

Cells. MLg cells were purchased from ATCC. Cancer-associated fibroblasts 
(CAFs) isolated from MMTV-PyMT tumours and human normal fibroblasts 
(hNLFs) were a gift from E. Sahai. MMTV-PyMT cells were isolated from
MMTV-PyMT tumours as previously described. All other cell lines were provided
by the Cell Services Unit of The Francis Crick Institute. All cell lines were
authenticated and tested for mycoplasma by the Cell Services Unit of The Francis
Crick Institute. MMTV-PyMT cells were cultured on collagen-solution-coated
dishes in DMEM/F12 (Thermo Fisher Scientific) with 2% fetal bovine serum
(FBS; Labtech), 100 U ml⁻¹ penicillin-streptomycin (Thermo Fisher Scientific),
20 ng ml⁻¹ EGF (Thermo Fisher Scientific) and 10 μg ml⁻¹ insulin (Merck
Sigma-Aldrich). The collagen solution was made with 30 μg ml⁻¹ PureCol
collagen (Advanced Biomatrix), 0.1% bovine serum albumin (BSA), 20 mM
HEPES in HBSS (Thermo Fisher Scientific). HC11 cells were cultured in RPMI
(Thermo Fisher Scientific) supplemented with 10% FBS, 100 U ml⁻¹ penicillin-streptomycin,
10 ng ml⁻¹ EGF (Thermo Fisher Scientific) and 5 μg ml⁻¹ insulin.
All other cell lines were cultured in DMEM (Thermo Fisher Scientific) supplemented
with 10% FBS and 100 U ml⁻¹ penicillin-streptomycin. All cells were
cultured at 37°C and 5% CO2.

Human samples. Human pulmonary breast cancer metastases from independent 
patients were obtained from the Grampian Biorepository, Aberdeen Royal 
Infirmary (REC approval: 16/NS/0055). Four samples were stained by immuno- 
histochemistry and immunofluorescence and proliferation of epithelial cells was 
quantified. Further information about the human samples used is provided in 
the Supplementary Information. 

Labelling system. A soluble peptide (SP) and a modified TAT peptide were
cloned upstream of the mCherry cDNA, under the control of a mouse PGK 
promoter (sLP-mCherry). The sLP-mCherry sequence was cloned into a pRRL 
lentiviral backbone. 4T1, Renca, CT26 and HC11 cells were stably infected with 
sLP-mCherry and pLentiGFP lentiviral particles and subsequently sorted to isolate 
mCherry*GFP* cells. 

Induction of experimental metastases. Procedures were performed at the Francis 
Crick Institute in accordance with UK Home Office regulations under project 
license P83B37B3C. Cancer cells were injected intravenously to generate metastases
in the lung: 4T1 (1,000,000), Renca (500,000) or CT26 (200,000) cells were resuspended
in 100 μl PBS and injected into the tail vein of BALB/cJ mice. Mice were
euthanized on the basis of a time period rather than on the basis of their clinical
signs. Therefore, the experimental end point (time controlled, seven days unless
otherwise specified) most likely occurred before a humane end point (as determined
by deterioration of health conditions). All animals were monitored daily
for unexpected clinical signs following the P83B37B3C licence guidelines and the
principles set out in the NCRI Guidelines for the Welfare and Use of Animals in
Cancer Research (UK). Deterioration of health conditions (such as reduction in
food and water consumption, changes in the general appearance of the animal, or
weight loss of 10% over a 24-h period) would result in animals being euthanized
before the experimental end point.

In vivo lineage-tracing experiments. Sftpc-CreERT2 and Scgb1a1-CreERT2 mice on a
C57BL/6 background were injected into the tail vein with 175,000 MMTV-PyMT
C57BL/6 cells and lungs were collected 4 weeks later, or with 700,000 E0771 cells 
and lungs were collected 12 days later. 

Tissue digestion for cell isolation or analysis. Lung tissues were dissociated as
previously described. In brief, lungs were removed at day 7 after tumour cell
injection (unless otherwise specified), minced manually and then digested for
30 min in a shaker at 37°C with a mixture of DNase I (Merck Sigma-Aldrich)
and Liberase TM and TH (Roche Diagnostics) in HBSS solution. Samples were
then washed, passed through a 100-μm filter and incubated in Red Blood Cell
Lysis buffer (Miltenyi Biotec) for 3-5 min at room temperature. After a wash
with MACS buffer (0.5% BSA and 250 mM EDTA in PBS), samples were passed
through a 40-μm filter and a 20-μm strainer-capped flow cytometry tube to
generate a single-cell suspension to use for flow cytometric analysis or further
purification. 

FACS analysis and cell sorting. Prepared single-cell suspensions of mouse lung 
tissues and in vitro cell lines were incubated with mouse FcR Blocking Reagent 
(Miltenyi Biotec) for 10 min at 4°C followed by an incubation with a mix of 
pre-labelled antibodies (antibody information is provided in the Supplementary 
Information) for 30 min at 4°C. After two washes with MACS buffer, dead cells 
were stained with DAPI. Flow cytometry analyses were carried out on a BD LSR-Fortessa
(BD Biosciences) and FlowJo v.10.4.2 (FlowJo, LLC 2006-2018) was used
for further analysis. All cell-sorting experiments were carried out using a BD Influx 
cell sorter (BD Biosciences). 

Tissue digestion and FACS analysis in ex-vivo lineage-tracing experiments. 
Lung tissues were dissociated with a collagenase-dispase solution as previously
described. In brief, after lungs were cleared by perfusion with cold PBS through
the right ventricle, 2 ml dispase (50 U ml⁻¹, BD Biosciences) was instilled into the
lungs through the trachea until the lungs inflated, followed by instillation of 1%
low melting agarose (Bio-Rad Laboratories) through the trachea to prevent leakage
of dispase. Each lobe was dissected and minced into small pieces in a conical tube
containing 3 ml PBS, 60 μl collagenase-dispase (Roche) and 7.5 μl of 1% DNase I
(Merck Sigma-Aldrich) followed by rotating incubation for 45 min at 37°C. The
cells were then filtered sequentially through 100- and 40-μm strainers and centrifuged
at 1,000 r.p.m. for 5 min at 4°C. The cell pellet was resuspended in 1 ml
of ACK lysis buffer (0.15 M NH4Cl, 10 mM KHCO3, 0.1 mM EDTA) and lysed
for 90 s at room temperature. Six millilitres of basic F12 medium (Thermo Fisher
Scientific) was added and 500 μl FBS (Fisher Scientific) was slowly added in the
bottom of the tube. Cells were centrifuged at 1,000 r.p.m. for 5 min at 4°C. The cell
pellet was resuspended in PF10 buffer (PBS with 10% FBS) for further staining.
The antibodies used were as follows: CD45 (30-F11)-APC (BD Biosciences), CD31
(MEC13.3)-APC (BD Biosciences) and EPCAM (G8.8)-PE-Cy7 (BioLegend).
For antibody list see Supplementary Information. The MOFLO system (Beckman 
Coulter) was used for the sorting at Wellcome-MRC Stem Cell Institute Flow 
Cytometry Facility. 

Lung organoid assay. Lung organoid co-culture assays were previously
reported. In brief, freshly sorted epithelial cells (EPCAM⁺CD45⁻CD31⁻Ter119⁻GFP⁻)
from either the metastatic niche or the distal lung were
resuspended in 3D basic medium (DMEM/F12, supplemented with 10% FBS,
penicillin-streptomycin, 1 mM HEPES and insulin-transferrin-selenium (ITS)
(Merck Sigma-Aldrich)), and mixed with MACS-sorted CD31⁺ lung stromal cells
or MLg cells followed by resuspension in growth factor-reduced (GFR) Matrigel
(BD Biosciences) at a ratio of 1:1. One hundred microlitres of mixture was then
placed in a 24-well transwell insert with a 0.4-μm pore (Corning). Distal lung or
niche epithelial cells (10° to 2.5 x 10° cells) and 25,000 CD31⁺ or MLg cells were
seeded in each insert. Five hundred microlitres of 3D basic medium was placed 
in the lower chamber and medium was changed every other day. In addition, 
freshly sorted Scgb1a1-lineage labelled cells or Sftpc-lineage labelled cells were 
resuspended in 3D basic medium followed by mixing with GFR Matrigel retaining 
CD31⁺ stromal cells as described above. For co-culture of lung epithelial cells with
tumour cells, a mixture of 10? to 2.5 x 10° distal lung epithelial cells and 25,000 
CD31⁺ cells in Matrigel was placed in the Transwell insert, and 2,000 tumour
cells were FACS-sorted from metastatic lungs and seeded in the lower chamber. 
Plates were scored for colony number after 14 days. Organoid-forming efficiency 
was calculated as the number of organoids formed per number of cells plated per 
well as a percentage. Quantification of distinct types of differentiated organoids 
was performed by scoring the organoids expressing SOX2 or SP-C and HOPX by 
immunofluorescence from at least five step sections (20 μm apart) per individual
well. Bright-field images were acquired after 14 days using an EVOS microscope 
(Thermo Fisher Scientific). 
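
The organoid-forming efficiency defined above reduces to a simple ratio expressed as a percentage. The sketch below illustrates it; the counts are hypothetical and the function name is ours.

```python
def organoid_forming_efficiency(organoids_counted, cells_plated):
    """Organoids formed per number of cells plated per well, as a percentage."""
    if cells_plated <= 0:
        raise ValueError("cells_plated must be positive")
    return 100.0 * organoids_counted / cells_plated


# Hypothetical well: 50 organoids scored at day 14 from 2,500 plated cells.
print(organoid_forming_efficiency(50, 2500))
```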

3D cell culture. Primary MMTV-PyMT actin-GFP cells were seeded at a density
of 5,000 cells per well in a collagen-solution-coated Alvetex Scaffold 96-well
plate (ReproCELL). The following day, Ly6G⁺ lung cells and/or Epcam⁺ lung
epithelial cells were sorted by MACS and seeded on top of the cancer cells at a
density of 50,000 cells per well. In selected experiments, wells were supplemented
with 4-hydroxy-TEMPO (200 μM, Merck Sigma-Aldrich) or mouse WISP1
antibody (250 ng ml⁻¹, MAB1680, R&D Systems). The growth of GFP⁺ cells
was monitored daily for 6 days using the SteREO LumarV12 stereomicroscope
(Zeiss), and images were quantified using ImageJ (NIH). For quantification,
Li's minimum cross-entropy thresholding algorithm was applied to the
stacked images.
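
For readers unfamiliar with Li's method, the following is a simplified sketch of the iterative minimum cross-entropy threshold applied to a flat list of positive pixel intensities. It is not the ImageJ implementation used for the quantification (which works on the image histogram); the function name, tolerance and example intensities are assumptions for illustration.

```python
import math


def li_threshold(pixels, tolerance=0.5, max_iter=1000):
    """Iterative minimum cross-entropy threshold for positive intensities."""
    values = [float(p) for p in pixels]
    if min(values) <= 0:
        raise ValueError("intensities must be positive (offset the image first)")
    threshold = sum(values) / len(values)  # start from the global mean
    for _ in range(max_iter):
        background = [v for v in values if v <= threshold]
        foreground = [v for v in values if v > threshold]
        if not background or not foreground:
            break  # uniform image: nothing to separate
        mean_back = sum(background) / len(background)
        mean_fore = sum(foreground) / len(foreground)
        # Update rule: logarithmic mean of the two class means.
        new_threshold = (mean_fore - mean_back) / (
            math.log(mean_fore) - math.log(mean_back)
        )
        if abs(new_threshold - threshold) < tolerance:
            return new_threshold
        threshold = new_threshold
    return threshold


# Hypothetical two-level image: dim background (10) and bright GFP signal (200).
pixels = [10] * 50 + [200] * 50
t = li_threshold(pixels)
gfp_area = sum(1 for p in pixels if p > t)  # pixels counted as GFP signal
print(round(t, 2), gfp_area)
```

Because the update is the logarithmic mean of the two class means, the threshold always stays between background and foreground intensities, which is why the method tends to behave well on images with a small bright object on a dark field.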

For the CD104 staining experiment, EPCAM⁺ lung cells were sorted from
mouse lung tissues by MACS and seeded at a density of 1,500,000 cells per well
on collagen-solution-coated Alvetex Scaffold 12-well inserts. After 48 h, MMTV-PyMT
actin-GFP cells were seeded on top of the EPCAM⁺ cells at a density of
2,000 cells per scaffold insert.

Immunofluorescence and immunohistochemistry. Mouse lungs were fixed in 
4% PFA in PBS for 24 h and embedded in paraffin blocks. Four-micrometre-thick 
tissue sections were cut, deparaffinized and rehydrated using standard methods. 
After heat-mediated antigen retrieval in citrate buffer (unless stated otherwise), 
sections were blocked with a solution of 1% BSA, 10% donkey serum in PBS. For 
antibody list, see Supplementary Information. 

mCherry and GFP staining. An overnight incubation at 4°C with goat GFP and 
rabbit mCherry antibodies was followed by 1 h incubation at room temperature 
with anti-goat Alexa Fluor 488- and anti-rabbit Alexa Fluor 555-conjugated anti- 
bodies (1:400; Thermo Fisher Scientific). Next, the slides were incubated with 
Sudan Black B for 20 min and mounted with Vectashield mounting medium with 
DAPI (Vector Laboratories). 

Lineage staining. An overnight incubation at 4°C with goat GFP antibody was 
followed by 45-min incubation at room temperature with secondary biotinylated 
antibodies. Next, the Vectastain Elite ABC kit (Vector Laboratories) was used 
according to the manufacturer's instructions. Cell nuclei were visualized with hae- 
matoxylin and analysis was performed on a Nikon Eclipse 90i light microscope 
and with NIS-elements software (Nikon). 

WISP1 staining. An overnight incubation at 4°C with goat GFP and rabbit WISP1
antibodies was followed by 30-min incubation at room temperature with anti-goat 
Alexa Fluor 488 and anti-rabbit Alexa Fluor 555 (1:500; Thermo Fisher Scientific). 
Next, the slides were incubated with Sudan Black B for 20 min and mounted with 
Vectashield mounting medium with DAPI (Vector Laboratories). 

Ki67 staining. EPCAM⁺CD45⁻CD31⁻Ter119⁻GFP⁻ cells were sorted from lung
suspensions, plated on polylysine-coated glass coverslips for 15 min at room tem- 
perature and fixed in 4% PFA in PBS for 10 min. After fixation, cells were perme- 
abilized with 0.1% Triton X-100 in PBS for 5 min and incubated with a blocking 
solution (1% BSA, 10% goat serum, 0.3 M glycine and 0.1% Tween-20 in PBS) 
for 1 h at room temperature. Next, cells were incubated overnight with a rabbit 
Ki67 antibody diluted in blocking solution followed by a 1 h incubation with a 
goat anti-rabbit Alexa Fluor 488 antibody (1:500; Thermo Fisher Scientific). 
Finally, cells were mounted with Vectashield mounting medium with DAPI for 
imaging. 

E-cadherin staining. CD49fCD104⁺CD45⁻CD31⁻Ter119⁻GFP⁻ cells were
sorted from lung suspensions, cytospun on glass slides and fixed in 4% PFA 
in PBS for 10 min. Next, cells were permeabilized with 0.5% Triton X-100 for 
30 min and incubated in blocking solution (4% BSA, 0.05% Tween-20 in 
PBS) for 45 min at room temperature. Then, cells were incubated with a rat 
E-cadherin antibody in blocking solution overnight at 4°C followed by an incu- 
bation with a goat anti-rat Alexa Fluor 647 antibody (1:500; Thermo Fisher 
Scientific). Finally, cells were mounted with Vectashield mounting medium 
with DAPI for imaging. 

CD104 staining. EPCAM⁺ cells were sorted by MACS and plated on Alvetex
scaffold inserts as described above. Seven days after plating the whole scaffold was
collected, washed with PBS and incubated in blocking solution (10% goat serum
in PBS) for 1 h at room temperature. Next, the samples were incubated with a
conjugated CD104-eFluor660 antibody (1:100 in PBS with 1:10 FcR blocking
(Miltenyi Biotec)) for 1 h at room temperature. Then, the samples were fixed with
4% PFA in PBS for 10 min and mounted with Vectashield mounting medium
with DAPI. Images were captured with the Axio Scan.Z1 slide scanner (Zeiss,
Germany).
Lung organoid staining. Cultured organoids were fixed with 4% PFA in PBS for 2-4 h 
at room temperature followed by immobilization with Histogel (Thermo Fisher 
Scientific) for paraffin embedding. At least five step sections (20 μm apart)
per individual well were stained. Fluorescence images were acquired using a 
confocal microscope Leica TCS SP5 (Leica Microsystems). All the images were 
further processed with Fiji software. 
TTF1 and Ki67 co-staining. Target retrieval solution pH 9 (Agilent DAKO) was
used for antigen retrieval. For histology, 1-h incubation at room temperature 
with mouse TTF1 antibody was followed by 45-min incubation at room tem- 
perature with secondary biotinylated antibodies. Next, the Vectastain Elite ABC 
kit (Vector Laboratories) was used according to the manufacturer’s instructions. 
Cell nuclei were visualized with haematoxylin and analysis was performed on 
a Nikon Eclipse 90i light microscope and with NIS-elements software (Nikon). 
For immunofluorescence, 1 h incubation at room temperature with mouse TTF1 
and rabbit Ki67 antibodies was followed by 45 min incubation at room temper- 
ature with anti-mouse Alexa Fluor 555 and anti-rabbit Alexa Fluor 488 (1:250; 
Thermo Fisher Scientific). Next, the slides were incubated with Sudan Black B 
for 20 min and mounted with Vectashield mounting medium with DAPI (Vector 
Laboratories). 

All images were captured with a Zeiss Upright710 confocal microscope or a 
Zeiss Upright780 confocal microscope unless otherwise stated. 
RT-qPCR. RNA preparation was performed using the MagMax-96 Total RNA 
Isolation Kit (Thermo Fisher Scientific). cDNA synthesis was performed using
a SuperScript III First-Strand Synthesis System (Thermo Fisher Scientific) 
according to the manufacturer’s protocol. Quantitative real-time PCR samples 
were prepared with 50-100 ng total cDNA for each PCR reaction. The PCR, data 
collection and data analysis were performed on a 7500 FAST Real-Time PCR 
System (Thermo Fisher Scientific). Glyceraldehyde 3-phosphate dehydrogenase 
(GAPDH) was used as an internal expression reference. A list of primers used can 
be found in the Supplementary Information. 
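
As an illustration of how fold changes relative to a control population can be derived from Ct values with GAPDH as the internal reference, the sketch below applies the commonly used 2^(−ΔΔCt) formula. The text reports fold changes and ΔCt-based statistics but does not state the exact formula, so its use here is an assumption; the function name and Ct values are hypothetical.

```python
def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method (assumed, not stated in the text)."""
    delta_ct_sample = ct_target_sample - ct_ref_sample      # e.g. mCherry+ niche EPCAM+ cells
    delta_ct_control = ct_target_control - ct_ref_control   # e.g. mCherry- lung EPCAM+ cells
    return 2.0 ** -(delta_ct_sample - delta_ct_control)


# Hypothetical Ct values: target gene vs GAPDH in niche and control cells.
print(fold_change(26.0, 18.0, 24.0, 18.0))  # ddCt = 2, so expression is reduced
```

Statistical testing on the ΔCt values, as stated in the Fig. 4 legend, avoids the skew introduced by exponentiation.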
Anti-WISP1 treatment in vivo. BALB/cJ female mice (6-8 weeks old) were
administered WISP1 antibody or a control-IgG antibody (5 μg AF1680 and
5 μg MAB1680, R&D Systems) via intra-tracheal injection (50 μl per mouse). The
following day, mice were intravenously injected with 250,000 4T1 cells. Anti-WISP1
or control-IgG treatment was repeated daily via a second intra-tracheal
injection on day 4 and intra-peritoneal injections on days 2, 3, 5 and 6. Mice were
collected 7 days after the first treatment and lungs were embedded, cut and stained
with haematoxylin and eosin (H&E). The lung metastatic burden was assessed by
counting the number of metastases on four levels (100-μm intervals) from two
lung lobes (n = 10 per group).
EdU in vitro proliferation assay. MMTV-PyMT actin-GFP cells were seeded at
a density of 10,000 cells per well into collagen-solution-coated six-well plates. The
following day, Ly6G⁺ lung cells and/or EPCAM⁺ lung cells were sorted by MACS
and added to the wells at a density of 100,000 cells per well. After 60 h, wells were
supplemented with 20 μM EdU (5-ethynyl-2′-deoxyuridine). Cells were collected
6 h later and EdU incorporation was assessed using the Click-iT Plus EdU Flow 
Cytometry Assay Kit (Thermo Fisher Scientific), according to the manufacturer's 
instructions. Sample data were acquired on a BD LSR-Fortessa flow cytometer and 
analysed using FlowJo 10 software. 
Conditioned medium preparation and vesicle isolation. Labelling-4T1 cells 
were plated on 10-cm Petri dishes. When cells were 80% confluent, 10 ml DMEM 
with 10% FCS was added to be conditioned for 48 h. The conditioned medium 
preparation and vesicle isolation were performed as previously described. In
brief, the medium was collected and spun at 300g for 10 min. Next, the super- 
natant was collected and spun at 2,000g for 10 min. The supernatant after this 
second centrifugation was collected and used as conditioned medium. For vesicle 
isolation, the conditioned medium was subsequently ultracentrifuged at 10,000g 
for 30 min and at 100,000g for 70 min. The vesicle pellet at this stage was washed 
with PBS, spun at 100,000g for 70 min and resuspended again in PBS for in vitro 
uptake experiments. 
ImageStream analysis. ImageStream analyses were carried out on an ImageStream
Mark X II Imaging Flow Cytometer (Amnis Merck). The acquired data were analysed
using IDEAS software (Amnis Merck).
Electron microscopy. Experiments were performed on glass bottom dishes 
with a numbered grid (MatTek) to enable subsequent location of the same cell 
imaged by confocal microscopy. After confocal imaging, cells were fixed in 8% 
formaldehyde in 0.1 M phosphate buffer (pH 7.4) added in equal quantities to 
cell medium for 15 min and then further fixed in 2.5% glutaraldehyde and 4% 
formaldehyde in 0.1 M phosphate buffer (pH 7.4) for 1 h and then processed 
using the National Center for Microscopy and Imaging Research protocol*’. For 
transmission electron microscopy, 70-nm serial sections were cut using a UC6 
ultramicrotome (Leica Microsystems) and collected on formvar-coated slot grids. 


No post-staining was required owing to the density of metal deposited using the 
NCMIR protocol. Images were acquired using a 120-kV Tecnai G2 Spirit trans- 
mission electron microscope (FEI Company Thermo Fisher Scientific) and an 
Orius CCD camera (Gatan). 

RNA sequencing sample preparation. Bulk RNA sequencing. CD45−Ter119− 
(CD45−) cells were sorted from single-cell suspensions of metastatic lungs stained 
with anti-mouse CD45 and Ter119 antibodies and DAPI. RNA isolation was 
performed using the MagMax-96 Total RNA Isolation Kit (Thermo Fisher 
Scientific), which enables high-quality RNA extraction from samples with low 
cell numbers (<10,000 cells). RNA quality for each sample was assessed using 
the Agilent RNA 6000 Pico Kit (Agilent Technologies). RNA was amplified and 
analysed at the Barts and London Genome Centre. 

Single-cell RNA sequencing. CD45−Ter119− cells were sorted from single-cell 
suspensions of metastatic lungs stained with anti-mouse CD45 and Ter119 antibodies 
and DAPI. Library generation for 10x Genomics analysis was performed following 
the Chromium Single Cell 3′ Reagent Kits protocol (10x Genomics), and libraries 
were sequenced on a HiSeq 4000 (Illumina) to achieve an average of 50,000 reads per cell. 
Determination of intracellular ROS levels. Single-cell suspensions from mouse 
lungs were incubated with mouse FcR blocking reagent for 5 min on ice and sub- 
sequently incubated with CellROX Deep Red Reagent (Thermo Fisher Scientific) 
for 30 min at 37°C following the manufacturer’s recommendations. Next, cells 
were washed twice with MACS buffer, stained with DAPI and analysed by flow 
cytometry. 

Quantitative proteomic analysis of Ly6G+ cells. Neutrophils were sorted by 
FACS from single-cell suspensions of metastatic lungs stained with a conjugated 
anti-mouse Ly6G-APC antibody (three samples from independent sorts). Ly6G+ 
cells from the metastatic niche (mCherry+) and the distal lung (mCherry−) were 
digested into peptides using a previously described protocol and analysed by 
data-independent acquisition mass spectrometry on an Orbitrap Fusion Lumos 
instrument (Thermo Fisher Scientific). A hybrid spectral library was generated 
using the search engine Pulsar in Spectronaut Professional+ (v.11.0.15038, 
Biognosys) by combining data-dependent acquisition runs obtained from a pooled 
sample of Ly6G+ cells with the data-independent acquisition data. Data analysis and 
differential protein expression analysis were performed using Spectronaut Professional+. 
A detailed description of sample processing, data acquisition and processing can 
be provided on request from the corresponding authors. 

Bioinformatics analysis. Bulk RNA sequencing. The sequencing was performed 
on biological triplicates for each condition, generating approximately 35 million 
76-bp paired-end reads. The RSEM package“ (v.1.2.29) and Bowtie2 were used 
to align reads to the mouse mm10 transcriptome, taken from the known-gene 
reference table available from University of California Santa Cruz 
(https://genome.ucsc.edu/). For RSEM, all parameters were run as default except "--forward-prob", 
which was set to 0.5. Differential-expression analysis was carried out with DESeq2 
package*® (v.1.12.4) in R v.3.3.1 (https://www.r-project.org/). Genes were consid- 
ered to be differentially expressed if the adjusted P was less than 0.05. Differentially 
expressed genes were taken forward and their pathway and process enrichments 
were analysed using MetaCore (https://portal.genego.com). A hypergeometric 
test was used to determine statistically enriched pathways and processes, and the 
associated P value was corrected using the Benjamini-Hochberg method. GSEA 
(v.2.2.3)4*4” was carried out using ranked gene lists using the Wald statistic and 
the gene sets of C2 canonical pathways and C5 biological processes. All param- 
eters were kept as default except for enrichment statistic (classic) and maximum 
size, which was changed to 5,000. Gene signatures with FDR q-value equal to or 
less than 0.05 were considered statistically significant. A weighted Kolmogorov- 
Smirnov-like statistic was derived and the associated P-value was corrected with 
the Benjamini-Hochberg method. 
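
The enrichment statistics described above can be illustrated with a minimal sketch. It is not the MetaCore or GSEA implementation; it only shows, under stated assumptions, how an upper-tail hypergeometric P value and a Benjamini-Hochberg adjustment are computed. The function names and all numbers in the usage lines are hypothetical.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """Upper-tail hypergeometric P value, P(X >= k).

    population: number of genes in the background
    successes:  number of background genes annotated to the pathway
    draws:      number of differentially expressed genes tested
    k:          number of differentially expressed genes in the pathway
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    hits = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return hits / total


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that monotonicity is enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: three pathways tested against the same gene list.
raw = [
    hypergeom_sf(12, 20000, 150, 400),
    hypergeom_sf(5, 20000, 300, 400),
    hypergeom_sf(30, 20000, 500, 400),
]
adjusted = benjamini_hochberg(raw)
significant = [q <= 0.05 for q in adjusted]
```

The 0.05 cut-off on the adjusted values mirrors the threshold stated in the text; the background size and pathway sizes are placeholders only.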

Single-cell RNA sequencing. Raw reads were initially processed by the Cell Ranger 
v.2.1.1 pipeline, which deconvolved reads to their cell of origin using the UMI 
tags, aligned these to the mm10 transcriptome using STAR (v.2.5.1b) and reported 
cell-specific gene expression count estimates. All subsequent analyses were per- 
formed in R v.3.4.1 using the cellrangerRkit, monocle and pheatmap packages. 
Genes were considered to be ‘expressed’ if the estimated (log10) count was at least 
0.1. Primary filtering was then performed by removing from consideration: genes 
expressed in fewer than 20 cells; cells expressing fewer than 50 genes; cells for 
which the total yield (that is, sum of expression across all genes) was more than 
two standard deviations from the mean across all cells in that sample; and cells 
for which mitochondrial genes made up greater than 10% of all expressed genes. 
PCA decomposition was performed and, after consideration of the eigenvalue 
‘elbow plots’, the first 25 components were used to construct t-SNE plots for both 
samples. Niche cells expressing Epcam were subdivided into those also expressing 
Cdh1 and those not expressing Cdh1. Other genes expressed in at least 50% of cells 
in a given group were said to be co-expressed and the set of genes co-expressed in 
one or more groups was presented as a heat map, with the columns (cells) clustered 
using the standard Euclidean hierarchical method. 
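
The primary filtering criteria listed above can be sketched as follows. This is an illustration only, not the authors' R code: the function name, the plain list-of-lists matrix layout, the use of the population standard deviation, the `mt-` prefix for mouse mitochondrial genes and the reading of the mitochondrial criterion as a fraction of expressed genes are all assumptions, and every filter is evaluated on the unfiltered matrix because the text does not state an order.

```python
from statistics import mean, pstdev


def filter_genes_and_cells(log_counts, gene_names, expr_threshold=0.1,
                           min_cells=20, min_genes=50, max_sd=2.0,
                           max_mito_frac=0.10, mito_prefix="mt-"):
    """Return (kept_gene_indices, kept_cell_indices).

    log_counts is a genes x cells matrix (list of rows) of log10-scale
    expression estimates; gene_names gives one name per row.
    """
    n_genes = len(log_counts)
    n_cells = len(log_counts[0]) if n_genes else 0
    expressed = [[v >= expr_threshold for v in row] for row in log_counts]

    # Genes must be expressed in at least min_cells cells.
    keep_genes = [g for g in range(n_genes) if sum(expressed[g]) >= min_cells]

    # Total yield per cell: sum of expression across all genes.
    yields = [sum(row[c] for row in log_counts) for c in range(n_cells)]
    mu, sd = mean(yields), pstdev(yields)

    keep_cells = []
    for c in range(n_cells):
        n_expr = sum(expressed[g][c] for g in range(n_genes))
        n_mito = sum(expressed[g][c] for g in range(n_genes)
                     if gene_names[g].startswith(mito_prefix))
        if n_expr < min_genes:
            continue  # too few expressed genes
        if abs(yields[c] - mu) > max_sd * sd:
            continue  # total yield is an outlier
        if n_expr and n_mito / n_expr > max_mito_frac:
            continue  # mitochondrial genes dominate
        keep_cells.append(c)
    return keep_genes, keep_cells
```

The default thresholds repeat the values given in the text (20 cells, 50 genes, two standard deviations, 10%); they are exposed as parameters so that the sketch can be tried on a small toy matrix.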
Reporting summary. Further information on research design is available in 
the Nature Research Reporting Summary linked to this paper. 
Extended Data Fig. 1 | The mCherry-niche system in vitro. a, sLP-mCherry 
design. b, Fluorescence images of labelling-4T1 cells after 
thawing. Scale bar, 10 μm. c, Representative FACS plot of labelling-4T1 
cells. d, In vitro cultures of the indicated cell types with LCM: culture 
scheme and representative fluorescence images of HC11 (mouse 
mammary epithelial cells) and hNLF (human normal lung fibroblasts) 
with LCM (scale bar, 10 μm). e, FACS plots of 4T1, HC11, RAW264.7 
(mouse macrophages), hNLF and mouse breast CAFs cultured with LCM. 
f, FACS analysis of 293T cells cultured with LCM, at different time points 
after LCM removal (black dots); white dots show the theoretical decrease 
considering the cell proliferation rate only (the amount of 293T cells 
labelled with mCherry after 24 h incubation with LCM was set to 100%). 
g, Representative fluorescence image of 4T1-CD63-GFP cells cultured 
with LCM. Scale bars: main panels, 5 μm; enlarged region, 1 μm. 
h, Representative correlative light and electron microscopy of 
labelling-4T1 cells showing re-uptake of sLP-mCherry (n = 5 different 
cells analysed). Top left, bright-field image overlaid with mCherry 
immunofluorescence (~700-nm optical section). Bottom left, electron 
microscopy of the same cell (~70-nm section thickness). Centre, best 
approximation of immunofluorescence-bright-field-electron microscopy 
overlay (scale bar, 5 μm). Right, electron microscopy of the outlined 
regions (centre, labelled a-c) (black arrows point at vesicular structures 
containing mCherry; scale bar, 1 μm). i, j, Analysis of in vitro labelling 
potential of soluble fraction and extracellular vesicles isolated from LCM 
by FACS. i, Schematic representation of LCM fractionation. j, HC11 cells 
cultured with either LCM, soluble fraction after depletion of extracellular 
vesicles (soluble) or purified extracellular vesicles. k, ImageStream analysis 
of mCherry+ extracellular vesicles in LCM (16% of total extracellular 
vesicles are mCherry+). Data are representative of three (b), ten (c) or 
two (d-g, j, k) independent experiments. 
Extended Data Fig. 2 | The mCherry-niche system in vivo. a, b, Distance 
of labelled cells within metastases. a, Representative fluorescence images 
(lines measure the maximum distance of labelled cells (mCherry+) from 
labelling-4T1 cells (mCherry+GFP+); scale bar, 50 μm). b, Quantification 
of labelling distance in micro-metastases (n = 11) and macro-metastases 
(n = 4). c, Correlation between the percentage of mCherry-labelled 
niche cells and the percentage of cancer cells in metastatic lungs analysed 
by FACS. Left, analysis of lungs with a small number of cancer cells 
(n = 14 mice). Right, analysis with all cancer cell frequencies (n = 31 
mice). Statistical analysis by Pearson correlation. d-f, CD45+ cell 
frequency on live cells in distal lung, mCherry+ niche and not-injected 
naive lungs by FACS. d, BALB/c mice injected with labelling-4T1 
cells (n = 5 mice per group). e, BALB/c mice injected with labelling-HC11 
cells (n = 4 mice). f, RAG1-knockout mice injected with labelling-4T1 
cells (n = 10 mice). Statistical analysis by paired two-tailed t-test. Data are 
represented as mean ± s.e.m. 
Extended Data Fig. 3 | mCherry+-niche neutrophils increase ROS 
production. a, b, CD11b+ (a) and Ly6G+ (b) cell frequencies among live 
cells in distal lung and mCherry+ niche by FACS (n = 9 mice per group). 
c, Enriched processes by MetaCore analysis and GSEA based on proteomic 
data by comparing mCherry+-niche (n = 3) and distal lung (n = 3) 
neutrophils; dominant mCherry+-niche proteins were obtained by using 
WebGestalt (http://www.webgestalt.org/option.php). d, PCA of proteins 
found in unlabelled or mCherry+-niche neutrophils (n = 3, each with 10 
mice, small circles; large circles represent the average of the triplicates). 
e, f, Representative FACS plot (e) and scatter plot (f) of intrinsic ROS in 
Ly6G+ cells (n = 6 mice). g, GFP signal quantification of 3D co-culture 
with GFP+ MMTV-PyMT cancer cells and MACS-sorted Ly6G+ cells 
from either naive or metastatic lungs with or without the ROS inhibitor 
TEMPO (n = 3, each with 3 technical replicates). Data are normalized 
to cancer cell growth (statistical analysis on biological replicates). 
h, Representative cancer cell growth on the scaffold (from 14 independent 
experiments): integrated density of the GFP signal was measured on the 
scaffold using ImageJ and the corresponding fluorescent image of GFP+ 
cancer cell growth (scale bar, 400 μm). Statistical analysis by paired 
two-tailed t-test (a, b, f), hypergeometric test with Benjamini-Hochberg 
correction (c, MetaCore), weighted Kolmogorov-Smirnov-like statistic 
with Benjamini-Hochberg correction (c, GSEA) and two-way ANOVA (g). 
Data are presented as mean ± s.d. (f) and mean ± s.e.m. (g). 
Extended Data Fig. 4 | RNA sequencing of non-immune mCherry+-niche 
cells. a, b, GSEA of upregulated genes in mCherry+-niche cells. 
a, Percentage of correlating processes related to the indicated activity. 
b, Specific signalling pathways (indicated by the * in a) at early or late time 
point. c, MetaCore analysis of genes differentially expressed in RNA-seq 
data, comparing early (n = 3) or late (n = 3) mCherry+ samples versus 
the respective mCherry− samples (see Fig. 3a, b). Statistical analysis by 
hypergeometric test with Benjamini-Hochberg correction. 
Extended Data Fig. 5 | WISP1 supports metastatic growth. 
a, b, Representative immunofluorescence images of lung metastatic 
tissues (n = 2 mice) stained for GFP (green) to detect labelling-4T1 cells, 
WISP1 (red) and DAPI (blue), showing distal lung and metastatic areas 
(a; scale bar, 50 μm), and a representative image showing the enrichment 
of WISP1+ cells within lung metastasis including niche cells (white 
arrows) (b; scale bar, 50 μm). c-e, WISP1-blocking antibody treatment 
in vivo. c, Experimental design (IT, intratracheal injection; IP, 
intraperitoneal injection). d, Metastatic outcome measured as the 
percentage of lung area covered by metastases (quantification was 
performed on two lung levels 100 μm apart). e, Representative H&E 
staining (n = 5 mice per group; black arrows show metastatic foci). Scale 
bar, 500 μm. Two experiments with lower overall metastatic frequency are 
quantified in Fig. 3e. Statistical analysis by two-way ANOVA (d). Data are 
presented as mean ± s.e.m. 
[Panel labels: Level 1, Level 2 (scale bar, 50 μm); IgG versus anti-WISP1, P = 0.033.] 
Extended Data Fig. 6 | Lung pneumocytes react to cancer cells in human 
breast pulmonary metastases. a-c, Histology of sections of human breast 
tumour lung metastases. a, Representative image of distal lung (scale bar, 
100 μm). b, Image from the tumour-lung interface showing expression 
of the pneumocyte marker thyroid transcription factor 1 (TTF1) (scale 
bar, 50 μm). c, Representative histology of the metastatic border (scale 
bar, 100 μm). d-f, Alveolar cell proliferation in human breast tumour 
lung metastases analysed by immunofluorescence. Representative images 
from distal lung (d) and metastatic border (e) showing TTF1 (red), Ki67 
(green) and DAPI (blue). Scale bars: all 100 μm, except e (far right), 50 μm. 
f, Quantification of alveolar proliferation. Box edges show 25th and 75th 
percentiles, the horizontal line shows the median and whiskers show 
the range of values. Statistical analysis by paired two-tailed t-test. Tissue 
sections from n = 4 independent patients were analysed. 
Extended Data Fig. 7 | Epithelial cells support cancer cell growth 
ex vivo. a, GFP+ MMTV-PyMT cancer cell proliferation in 2D co-culture 
with MACS-sorted EPCAM+ and Ly6G+ cells stained with EdU and 
analysed by FACS (n = 3 independent experiments). Data are normalized 
to cancer cell proliferation. b-d, Three-dimensional co-culture of GFP+ 
MMTV-PyMT cancer cells with MACS-sorted EPCAM+ and Ly6G+ cells. 
b, Co-culture scheme. c, Representative images from four independent 
experiments (day 4; scale bar, 400 μm). d, Quantification of GFP 
signal. Data are normalized to cancer cell growth (n = 4 independent 
experiments (dots), each with 3-4 technical replicates). Statistical analysis 
of biological replicates by one-sample two-tailed t-test (a) and two-way 
ANOVA (d). Data are represented as mean ± s.e.m. 
Extended Data Fig. 8 | scRNA-seq analysis reveals different sub-pools 
of stromal cells in the niche. a, t-SNE plots of CD45− cells isolated from 
distal lung (n = 1,996) or mCherry+ niche (n = 1,473) after scRNA-seq 
analysis. Stromal cells are coloured on the basis of expression levels of 
the indicated genes. b, t-SNE niche plots from data in a; each plot shows 
(in red) the cells expressing the indicated stromal marker. c, MetaCore 
pathway enrichment analysis using the list of genes detected in at least 
50% of the indicated marker-defined cells (n = 66 THY1+ cells, n = 175 
PDGFRB+ cells, n = 322 PDGFRA+ cells, n = 330 ACTA2+ cells, n = 25 
LGR6+ cells). Statistical analysis by hypergeometric test with 
Benjamini-Hochberg correction. 
Extended Data Fig. 9 | mCherry+-niche epithelial cells are enriched 
for stem cell markers. a, Representative FACS plots showing Lin− 
(CD45−CD31−Ter119−) cells in distal lung and mCherry+ niche from 
labelling-4T1-injected mice (quantification in Fig. 4i). b, c, Scatter plots 
showing FACS quantification of EPCAM+SCA1+ cell frequency on Lin− 
(CD45−CD31−Ter119−) cells in distal lung and mCherry+ niche with 
injection of labelling-RENCA (b; n = 5 mice) and labelling-CT26 (c; n = 4 
mice). d-f, Scatter plot of CD49f+CD104+ cell frequency among Lin− 
(CD45−CD31−Ter119−) cells in distal lung and mCherry+ niche detected 
by FACS (d; n = 5 mice), representative FACS plots (e) and representative 
immunofluorescence image of FACS-sorted mCherry+-niche 
CD49f+CD104+ cells stained for E-cadherin (green) and with DAPI (blue) 
(f; scale bar, 20 μm). g-i, Three-dimensional co-culture of GFP+ 
MMTV-PyMT cancer cells with MACS-sorted EPCAM+ cells. g, Quantification of 
integrin β4 (CD104) expression on EPCAM+ cells. h, Number of CD104+ 
cells proximal to cancer cells (n = 4 from three independent sorts). 
i, Representative immunofluorescence image from the co-culture stained 
for CD104 (red), GFP+ cancer cells (green) and with DAPI (blue). Scale 
bar, 20 μm. Statistical analysis of biological replicates by paired two-tailed 
t-test (b-d, g). Data are presented as mean ± s.e.m. 
Extended Data Fig. 10 | Cancer cells change lung epithelial cell-lineage 
commitment ex vivo. a, Representative immunofluorescence images of 
lung metastatic sections (n = 3 mice) co-stained for an airway marker 
(SCGB1A1 (top; white) or SOX2 (bottom; white)) and mCherry (red), and 
with DAPI (blue). Scale bar, 100 μm. b, c, Lung organoids from EPCAM+ 
FACS-sorted cells in co-culture with either lung stromal CD31+ cells 
or MLg fibroblasts, alone or in the presence of non-labelling 4T1-GFP 
cells from metastatic lungs in the lower chamber; quantification (b) and 
representative bright-field images (c; scale bar, 150 μm) of organoids. 
d, e, Lung organoids with Scgb1a1-CreERT2 lineage cells with or without 
4T1-GFP: quantification (d) and representative bright-field images 
(e; scale bar, 150 μm). f, Representative staining of lineage cells in 
metastatic lungs from Scgb1a1-CreERT2 mice injected with MMTV-PyMT 
cancer cells. Scale bars: top left, 200 μm; other panels, 50 μm; top middle 
inset, 25 μm. Data are generated with sorted EPCAM+ (b) or club-lineage 
cells (d) and represented as cumulative percentage, presented 
as mean ± s.d. of three co-cultures per sorting. Statistical analysis by 
two-tailed t-test on original non-cumulative values (b, d). Images are 
representative of three organoid cultures (c, e). 
n/a | Confirmed 


x| The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement 


x| A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly 


The statistical test(s) used AND whether they are one- or two-sided 
Only common tests should be described solely by name; describe more complex techniques in the Methods section. 


x| A description of all covariates tested 


x| A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons 


A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) 
AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) 


For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted 


x Give P values as exact values whenever suitable. 
x For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings 
x For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes 
x Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated 


Our web collection on statistics for biologists contains articles on many of the points above. 


Software and code 


Policy information about availability of computer code 


Data collection Flow cytometry: samples were run on a BD 671 LSR-Fortessa (BD Biosciences, USA) using the BD FACSDiva software v8.0.1. 
Data analysis Statistics: analyses were performed using Prism software (version 7.0c, GraphPad Software, USA) with the exception of the qRT-PCR data, 
for which R was used. 

Fluorescence imaging: FiJi (version 2.0.0-rc-68/1.52g, ImageJ) and Adobe Photoshop CC 2018 (version 19.0, Adobe, USA) were used to 
analyse fluorescence images. 

Immunohistochemistry: images were acquired using NIS-Elements software (version 4.51, Nikon, Japan). 

Flow cytometry: data analyses were carried out using FlowJo 10.4.2 (FlowJo, LLC 2006-2018, USA). 

ImageStream: analyses were performed using IDEAS software (version 6.2, Amnis, Merck, USA). 

Proteomics: data analysis and differential protein expression analysis were performed using Spectronaut Professional+. A detailed description of 
sample processing, data acquisition and processing is available on request. 

RNA sequencing: the RSEM package (version 1.2.29) and Bowtie2 were used to align reads to the mouse mm10 transcriptome. 
Differential expression analysis was carried out with the DESeq2 package (version 1.12.4) within R version 3.3.1 (https://www.r-project.org/). 
Gene Set Enrichment Analysis, GSEA, (version 2.2.3) was carried out using ranked gene lists using the Wald statistic and the gene sets of 
C2 canonical pathways and C5 biological processes. Heatmaps of differentially expressed genes were generated using the gplots (Gregory 
et al., gplots: Various R Programming Tools for Plotting Data. R package version 3.0.1. (2016). https://CRAN.R-project.org/ 
package=gplots) CRAN package (version 3.0.1). 

Single-cell RNA-sequencing: the Cell Ranger v2.1.1 pipeline was used to process raw reads, using STAR (v2.5.1b) to align to the mm10 
transcriptome, deconvolve reads to their cell of origin using the UMI tags and report cell-specific gene expression count estimates. All 
subsequent analyses were performed in R-3.4.1 using the cellrangerRkit, monocle and pheatmap packages. 


See methods for further details 


For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. 
We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information. 


Data 


Policy information about availability of data 
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: 


- Accession codes, unique identifiers, or web links for publicly available datasets 
- A list of figures that have associated raw data 
- A description of any restrictions on data availability 


The RNA sequencing datasets (GSE117930) and the single cell RNA sequencing datasets (GEO13150) are deposited in the Gene Expression Omnibus (GEO, NCBI) 
repository. The proteomic datasets are deposited in the PRoteomics IDEntifications (PRIDE) repository (PXD010597). 


Field-specific reporting 


Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection. 


X | Life sciences Behavioural & social sciences Ecological, evolutionary & environmental sciences 


For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf 


Life sciences study design 


All studies must disclose on these points even when the disclosure is negative. 
Sample size Sample sizes were estimated based on previous experiments conducted in our laboratory, providing sufficient numbers of mice in each group 
to yield a two-sided statistical test, with the potential to reject the null hypothesis with a power (1 - beta) of 80%, subject to alpha = 0.05. 
Data exclusions No data were excluded. 


Replication Unless otherwise specified in the figure legends, experiments were reproduced in at least two independent experiments. 


Randomization The majority of the in vivo data generated in this study involved analysis between different areas of the same tissue in each mouse; therefore, 
control and experimental groups could not be randomized. The experiment involving a therapeutic treatment with the antibody was performed on 
littermate mice that were all injected with tumour cells and then randomized for the antibody treatment. 


Blinding Investigators were not blinded for studies involving the analysis of niche versus distant lung cells, as the cells were from the same samples 
and the two subsets could only be discriminated by FACS analysis itself. Experiments using sorted and stained cells (niche versus distant lung), 
scaffold assays and organoid assays were blinded at quantification. For the in vivo treatment experiment with anti-WISP1, the quantification of 
metastatic burden between the two groups was performed blinded. 
We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, 
system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response. 


Materials & experimental systems Methods 
x| Antibodies x ChIP-seq 

Antibodies used 
ANTIBODY_ COMPANY_ CATALOGUE No_ CLONAL_ (CLONE)_ DILUTION (Technique) 
Acetylated-tubulin_ Sigma-Aldrich_ 17451_ Mouse monoclonal_ (6-11B-1)_ 1:1000 (IF) 
CC10 (SCGB1A1)_ Santa Cruz_ sc-25555_ Rabbit polyclonal_ (FL-96)_ 1:200 (IF) 
CD11b-APC_ Biolegend_ 10121_ Rat monoclonal_ (M1/70)_ 1:100 (FC) 
CD11b-APCCy7_ Biolegend_ 101226_ Rat monoclonal_ (M1/70)_ 1:100 (FC) 
CD45-BV421_ Biolegend_ 103133_ Rat monoclonal_ (30-F11)_ 1:200 (FC) 
CD45-APC_ eBioscience_ 17-0451-83_ Rat monoclonal_ (30-F11)_ 1:200 (FC) 
CD45-APC-eFluor780_ eBioscience_ 47-0451-82_ Rat monoclonal_ (30-F11)_ 1:200 (FC) 
CD49f-PerCP-eFluor710_ eBioscience_ 46-0495-82_ Rat monoclonal_ (ebioGOH3)_ 1:200 (FC) 
CD104-eFluor660_ eBioscience_ 50-1049-82_ Rat monoclonal_ (439-9b)_ 1:100 (FC; IF) 
CD326(EPCAM)-APC_ eBioscience_ 17-5791-81_ Rat monoclonal_ (G8.8)_ 1:200 (FC) 
CD326(EPCAM)-APC750Fire_ Biolegend_ 118230_ Rat monoclonal_ (G8.8)_ 1:200 (FC) 
E-CADHERIN_ Abcam_ Ab11512_ Rat monoclonal_ (DECMA-1)_ 1:200 (IF) 
GFP_ Abcam_ ab6673_ Goat polyclonal_ 1:300 (IF) 
HOPX_ Santa Cruz_ sc-30216_ Rabbit polyclonal_ (FL-73)_ 1:250 (IF) 
Ki67_ Abcam_ Ab16667_ Rabbit monoclonal_ (SP6)_ 1:300 (IF) 
Ly6A/E(SCA-1)-APC_ Biolegend_ 108111_ Rat monoclonal_ (D7)_ 1:200 (FC) 
Ly6A/E(SCA-1)-APC750Fire_ Biolegend_ 127652_ Rat monoclonal_ (D7)_ 1:200 (FC) 
Ly6A/E(SCA-1)-BV786_ BD Bioscience_ 563991_ Rat monoclonal_ (D7)_ 1:200 (FC) 
Ly6G-APC_ BD Bioscience_ 560599_ Rat monoclonal_ (1A8)_ 1:150 (FC) 
Ly6G-APC750Fire_ Biolegend_ 127652_ Rat monoclonal_ (1A8)_ 1:150 (FC) 
Ly6G-V450_ BD Bioscience_ 560603_ Rat monoclonal_ (1A8)_ 1:150 (FC) 
mCHERRY_ Abcam_ ab183628_ Rabbit polyclonal_ 1:750 (IF) 
SOX2_ eBioscience_ 14-9811-80_ Rat monoclonal_ (Btjce)_ 1:500 (IF) 
SP-C_ Santa Cruz_ sc-7706_ Goat polyclonal_ (M-20)_ 1:200 (IF) 
TER-119_ Biolegend_ 116233_ Rat monoclonal_ (TER-119)_ 1:200 (FC) 
TTF1_ DAKO_ M3575_ Mouse monoclonal_ (8G7G3/1)_ 1:50 (IF) 
WISP1_ Abcam_ Ab178547_ Rabbit polyclonal_ 1:100 (IF) 


Validation The antibodies used have been validated according to the manufacturers' instructions. Mouse lung cell suspensions were used to 
validate FACS antibodies. Human or mouse lung sections were used to validate the antibodies for IF or IHC staining. 


Eukaryotic cell lines 


Policy information about cell lines 


Cell line source(s) MLg cells (murine normal lung fibroblasts) were purchased from ATCC (USA). CAFs (cancer-associated fibroblasts) isolated 
from MMTV-PyMT tumours and human normal lung fibroblasts (hNLF) (FVB background) were a gift from E. Sahai. All other cell 
lines (4T1, E0771, HC11, RAW264.7, RENCA, CT26) were provided by the Cell Services Unit of The Francis Crick Institute. 
For primary cells, MMTV-PyMT cells were isolated from growing MMTV-PyMT tumours (FVB or C57BL/6 background). 


Authentication Short Tandem Repeat (STR) was used to identify all cell lines used while SPID was used to confirm the species of origin. 



Mycoplasma contamination All cells are routinely tested for Mycoplasma by the Cell Services Unit of The Francis Crick Institute. 
Commonly misidentified lines No commonly misidentified lines were used. 
(See ICLAC register) 


Animals and other organisms 


Policy information about studies involving animals; ARRIVE guidelines recommended for reporting animal research 


Laboratory animals Mice used for these experiments were females of wild-type BALB/c or C57BL/6 background, MMTV-PyMT/actin-GFP and RAG1-knockout 
mice in FVB background, and MMTV-PyMT, Sftpc-CreERT2, Rosa26R-YFP, Scgb1a1-CreERT2 and Rosa26R-fGFP mice in C57BL/6 
background. All mice used were females between 4 and 10 weeks of age. All mice were bred in house at The Francis Crick 
Biological Research Facility or The Gurdon Institute of University of Cambridge, according to UK Home Office Regulations. 


Wild animals This study did not involve wild animals. 
Field-collected samples This study did not involve samples collected from the fields. 
Ethics oversight All experiments were approved by Francis Crick/Cambridge University ethical review committees and conducted according to UK 


Home Office Regulations (project license PPL/80/2531 and PC7F8AE82). 


Note that full information on the approval of the study protocol must also be provided in the manuscript. 
Flow Cytometry 


Plots 
Confirm that: 


x | The axis labels state the marker and fluorochrome used (e.g. CD4-FITC). 


x | The axis scales are clearly visible. Include numbers along axes only for bottom left plot of group (a ‘group’ is an analysis of identical markers). 


|X| All plots are contour plots with outliers or pseudocolor plots. 


x | A numerical value for number of cells or percentage (with statistics) is provided. 


Methodology 


Sample preparation Lungs were minced and digested for 30 min in a shaker at 37°C with a mixture of DNase I (Merck Sigma-Aldrich, Germany) and 
Liberase TM and TH (Roche Diagnostics, Switzerland) in HBSS solution. Samples were then washed, passed through a 100-μm filter 
and incubated in Red Blood Cell Lysis buffer (Miltenyi Biotec, Germany) for 3-5 min at room temperature. After a wash with 
MACS buffer (0.5% BSA and 250 mM EDTA in PBS), samples were passed sequentially through a 40-μm filter and a 20-μm 
strainer-capped flow cytometry tube to obtain a single-cell suspension for flow cytometric analysis or further purification. 

For ex vivo lineage tracing experiments: Lungs were cleared by perfusion, dissected, minced and dissociated with a collagenase/ 
dispase solution containing DNase I for 45 min at 37°C. Cells were then filtered sequentially through 100- and 40-μm strainers, 
centrifuged at 1,000 rpm for 5 min at 4°C and resuspended in 1 ml of ACK lysis buffer (0.15 M NH4Cl, 10 mM KHCO3, 0.1 mM 
EDTA) for 90 s at room temperature for red blood cell lysis. Cells were washed with basic F12 media containing FBS 
(ThermoFisher Scientific, USA), centrifuged and resuspended in PF10 buffer (PBS with 10% FBS) for further staining. 


Instrument Flow cytometry analyses were carried out on a BD LSR-Fortessa (BD Biosciences, USA). The majority of cell-sorting experiments 
were carried out on a BD Influx cell sorter (BD Biosciences, USA), with the exception of the ex vivo lineage tracing experiments
which were performed on the MOFLO system (Beckman Coulter). 

Software FlowJo 10.4.2 (FlowJo, LLC 2006-2018, USA) was used for analysis.

Cell population abundance Purity check was routinely performed after each sorting. Cells were used when purity was above 85%.


Gating strategy All gating strategies are described in the methods and two typical examples are provided.


Tick this box to confirm that a figure exemplifying the gating strategy is provided in the Supplementary Information. 
Identification of an ATP-sensitive
potassium channel in mitochondria


Angela Paggio, Vanessa Checchetto, Antonio Campo, Roberta Menabo, Giulia Di Marco, Fabio Di Lisa, Ildiko Szabo,
Rosario Rizzuto* & Diego De Stefani*


Mitochondria provide chemical energy for endoergonic reactions in the form of ATP, and their activity must meet cellular
energy requirements, but the mechanisms that link organelle performance to ATP levels are poorly understood. Here we
confirm the existence of a protein complex localized in mitochondria that mediates ATP-dependent potassium currents
(that is, mitoKATP). We show that, similar to their plasma membrane counterparts, mitoKATP channels are composed
of pore-forming and ATP-binding subunits, which we term MITOK and MITOSUR, respectively. In vitro reconstitution
of MITOK together with MITOSUR recapitulates the main properties of mitoKATP. Overexpression of MITOK triggers
marked organelle swelling, whereas the genetic ablation of this subunit causes instability in the mitochondrial membrane
potential, widening of the intracristal space and decreased oxidative phosphorylation. In a mouse model, the loss of
MITOK suppresses the cardioprotection that is elicited by pharmacological preconditioning induced by diazoxide. Our
results indicate that mitoKATP channels respond to the cellular energetic status by regulating organelle volume and
function, and thereby have a key role in mitochondrial physiology and potential effects on several pathological processes.


ATP-sensitive potassium (KATP) channels act as sensors of cellular
metabolism. In the plasma membrane, they couple cell excitability
with energy availability. They have also been reported to be located
in intracellular membranes, for example in mitochondria (that is,
mitoKATP), but in this context their existence is a matter of debate.
MitoKATP mediates the electrophoretic uptake of potassium ions (K+),
which is driven by the negative mitochondrial membrane potential
(ΔΨm), and it is inhibited by physiological levels of ATP. MitoKATP was
first described in the early 1990s, through patch clamp of mitoplasts
or by partial purification techniques. MitoKATP has been characterized
from a pharmacological point of view, and both openers (diazoxide)
and inhibitors (sulfonylureas and 5-hydroxydecanoate (5-HD))
have been described, some of them with proposed specific action on
mitoKATP versus plasma membrane KATP channels. Drugs that target
KATP channels are useful in the treatment of several pathologies.
Importantly, some of their uses are due to the modulation of plasma
membrane KATP, but others seem to depend on their effects on mitoKATP.
For example, pharmacological preconditioning with diazoxide
efficiently protects the heart from ischaemia-reperfusion injury
even in the absence of cardiac plasma membrane KATP. However,
the molecular identity and pharmacology of the mitoKATP channel
remain unknown.


MITOK is a cation channel 

We screened a subset of mitochondrial proteins with unknown function
and focused on a candidate protein that is encoded by the CCDC51
gene (NCBI code 79714; we hereafter name the CCDC51 gene MITOK),
the overexpression of which markedly impaired mitochondrial physiology.
The MITOK gene is conserved in vertebrates, in which it encodes a
unique 45-kDa protein with a predicted N-terminal mitochondrial targeting
sequence, one coiled-coil and two transmembrane domains. In
humans, the MITOK gene encodes two isoforms: isoform 1, which
is full-length but has no predicted mitochondrial targeting sequence,
and isoform 2, which is a splice variant that lacks the first 109 amino
acids (34 kDa in size) and includes a supposed mitochondrial targeting
sequence (Fig. 1a). Analyses of RNA and protein levels using existing
datasets revealed that MITOK is expressed in all tissues in humans,
as is Ccdc51 (which we hereafter name Mitok) in mice. First, we
experimentally validated the mitochondrial localization of MITOK in
humans and mice. Immunofluorescence showed a full co-localization
of MITOK with a mitochondrial marker in HeLa cells (Fig. 1b), and
subcellular fractionation of mouse liver revealed a progressive enrichment
of MITOK in mitochondria and mitoplasts, and the absence of
this protein on the outer membrane (Fig. 1c). Carbonate extraction
confirmed membrane insertion (Extended Data Fig. 1a), which indicates
that MITOK is in the inner mitochondrial membrane. To investigate
the topology of MITOK, we used two antibodies, one against
the N-terminal half and the other covering the C-terminal half of
MITOK (Fig. 1d). Digestion of mouse mitoplasts using proteinase K
caused the loss of full-length MITOK and the appearance of a smaller
fragment that was recognized by the N-terminal antibody (Extended
Data Fig. 1b), which indicates that a portion of the protein is protected
inside the organelle. By contrast, no residual signal was detected with an
antibody against the region between the two transmembrane domains,
which indicates that this part is exposed to the intermembrane space.
MITOK is a two-pass protein of the inner mitochondrial membrane
with the N and C termini exposed towards the matrix. Next, we cloned
and tagged (with Flag, V5 and GFP) mouse MITOK and investigated
the effect of its overexpression. In terms of morphology, MITOK overexpression
caused organelle fragmentation (Extended Data Fig. 1c) and
swelling (Fig. 1e). At the functional level, MITOK overexpression caused
a drop in ΔΨm (Extended Data Fig. 1d) and agonist-induced mitochondrial
Ca2+ uptake (an additional readout for changes in ΔΨm)
(Extended Data Fig. 1e). For the human isoforms, we designed both
specific (isoform 1 versus isoform 2) and non-specific (pan-isoforms)
primers for quantitative PCR. Both isoforms can be detected in HeLa
cells at the transcript level, although isoform 2 is expressed at one tenth
the level of isoform 1 (Extended Data Fig. 1f). Despite this, only a


1Department of Biomedical Sciences, University of Padova, Padova, Italy. 2Department of Biology, University of Padova, Padova, Italy. 3CNR Institute of Neuroscience, Padova, Italy. 4These authors
contributed equally: Angela Paggio, Vanessa Checchetto. *e-mail: rosario.rizzuto@unipd.it; diego.destefani@gmail.com
Fig. 1 | Biochemical and functional characterization of MITOK.
a, Representation of human (hs) and mouse (mm) MITOK proteins.
Transmembrane (TM) and coiled-coil (CC) domains are indicated. iso1,
isoform 1; iso2, isoform 2. b, Immunolocalization of MITOK (green)
and the mitochondrial marker HSP60 (red). Representative of four
independent experiments. Scale bar, 10 µm. c, Subcellular fractionation
of mouse liver (replicated twice). IMS, inter-membrane space; OMM,
outer mitochondrial membrane. d, Representation of MITOK membrane
topology. C-term, C terminus; IMM, inner mitochondrial membrane;
N-term, N terminus; anti-MITOK, antibody raised against the indicated
region of MITOK. e, Transmission electron microscopy images of control
and MITOK-overexpressing HeLa cells (replicated three times).
f, g, Representative current traces with MITOK purified from E. coli
(f, n = 5 biological replicates from 2 independent preparations) or expressed
in vitro (g, n = 23 biological replicates from 10 independent preparations).


single band can be detected in western blots using HeLa cells (Extended 
Data Fig. 1g). In most human tissues, isoform 1 shows moderate levels 
of expression, whereas isoform 2 is barely detectable. This pattern is 
reversed in the case of the spleen, which shows substantial expression 
of isoform 2 and marginal expression of isoform 1 (Extended Data 
Fig. 1f). In terms of their roles, isoform 1 localizes to mitochondria 
and—similar to mouse MITOK—its overexpression induced morpho- 
logical and functional organelle impairment (Extended Data Fig. 2a, b), 
which indicates that both mouse and human MITOK genes encode 
similar proteins (although some human cells express a shorter and 
less-active splicing variant). Overall, overexpression of MITOK causes a 
severe perturbation of mitochondrial structure and function. Although 
several mechanisms could account for these effects, we reasoned that
MITOK could act as a cation channel, because valinomycin (a K+ ionophore,
a molecule that mediates K+ influx into the matrix) closely
mimics the phenotype observed on MITOK overexpression.

The unambiguous demonstration of channel activity necessarily 
requires a simplified reconstitution approach that uses recombinant 
proteins. We thus measured the channel activity of mouse MITOK in 
the planar lipid bilayer using MITOK from two systems (Escherichia 
coli and the wheat-germ cell-free transcription and translation tool), 
both of which express MITOK at high levels (Extended Data Fig. 3a, b). 
We observed channel activity in a medium that contained only K+
as cation (Fig. 1g, h). Burst-like, flickering activity and cooperative
transitions between dual- or multi-states were observed (Extended
Data Fig. 3c, d). The channel showed an ohmic behaviour (Extended
Data Fig. 3e), was voltage-independent (Extended Data Fig. 3f) and
was selective for K+ over chloride (PK:PCl = 1:0.02) (Extended Data
Fig. 2 | Electrophysiological characterization of recombinant MITOK
co-expressed with MITOSUR. a, Current traces before (control) and
after the first addition of 500 µM Mg and ATP, the second addition of
500 µM Mg and ATP and the third addition of 80 µM diazoxide. All
traces were obtained from the same experiment, representative of four
independent experiments. b, Current recordings before and after addition
of 30 µM glibenclamide. The channel was re-activated by subsequent
addition of 100 µM diazoxide (n = 4 for inhibition by glibenclamide,
n = 2 for reactivation by diazoxide). c, Representative histograms before
(top) and after (bottom) addition of 5-HD (100 µM, n = 5 independent
experiments).


Fig. 3g). Channel conductance was 57 ± 11 pS in both 100 mM KCl
and K-gluconate media (Fig. 1h, Extended Data Fig. 3e). Channel
activity could be blocked by addition of barium, an inhibitor of K+
channels (Extended Data Fig. 3h), but not by paxilline, an inhibitor of
the BKCa channels (Extended Data Fig. 3i).
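
As an illustration of how an ohmic single-channel conductance such as the 57 ± 11 pS reported above can be estimated, the following minimal Python sketch fits a straight line to a current-voltage relation and reads the conductance from its slope. The data points and the function name conductance_pS are hypothetical and are not taken from the recordings in this study; the sketch assumes currents in pA and voltages in mV.

```python
def conductance_pS(voltages_mV, currents_pA):
    """Return the least-squares slope of I versus V, in picosiemens."""
    n = len(voltages_mV)
    if n < 2 or n != len(currents_pA):
        raise ValueError("need at least two paired (V, I) points")
    mean_v = sum(voltages_mV) / n
    mean_i = sum(currents_pA) / n
    sxx = sum((v - mean_v) ** 2 for v in voltages_mV)
    sxy = sum((v - mean_v) * (i - mean_i)
              for v, i in zip(voltages_mV, currents_pA))
    slope_nS = sxy / sxx          # pA per mV equals nanosiemens
    return slope_nS * 1000.0      # convert nS to pS


# Hypothetical, noise-free I-V points for an ideal 57 pS ohmic channel.
voltages = [-60.0, -40.0, -20.0, 20.0, 40.0, 60.0]
currents = [0.057 * v for v in voltages]
print(round(conductance_pS(voltages, currents), 1))
```

For an ohmic channel the fitted line is expected to describe the data equally well at positive and negative potentials; systematic curvature would instead point to rectification.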


MITOK and MITOSUR form the mitoKATP channel

On the basis of these data, we pursued the hypothesis that MITOK
could be mitoKATP, despite the fact that (i), similar to the lysosomal K+
channel TMEM175, MITOK does not contain the typical K+ selectivity
filter (and, accordingly, allows permeation of Na+) (Extended Data
Fig. 4a) and (ii) the purified protein per se did not respond to ATP
(Extended Data Fig. 4b) or 5-HD (Extended Data Fig. 4c). However,
we reasoned that ATP sensitivity could be conferred by a regulatory
sulfonylurea-receptor (SUR)-like subunit. Ten ATP-binding cassette
(ABC) proteins can be detected in mitochondria, most of which belong
to the ABCB subfamily. We focused on ABCB8 (which we hereafter
name MITOSUR) because (i), of the ten ABC proteins in mitochondria,
the tissue expression of this protein best correlates with MITOK, and (ii)
it has previously been suggested to be part of mitoKATP. We expressed
in vitro mouse MITOK together with MITOSUR; these were folded
and incorporated into liposomes, as indicated by thermal stability
assay (Extended Data Fig. 5a-c) and membrane extraction (Extended
Data Fig. 5d), with a membrane orientation that resembled that in
mitoplasts (Extended Data Figs. 5e, f, 6a). MITOK and MITOSUR
were able to form a K+-permeable channel (Fig. 2a, Extended Data
Fig. 5g-i) that was (i) inhibited by millimolar concentrations of
ATP, (ii) activated by diazoxide (Fig. 2a, Extended Data Fig. 5g, h)
and (iii) blocked by both the sulfonylurea glibenclamide (Fig. 2b) and
5-HD (Fig. 2c). The channel conductance slightly decreased upon addition
of 1 mM Mg2+ (Extended Data Fig. 5j). Activity was observed in
Fig. 3 | MITOK and MITOSUR form the mitoKATP channel in situ.
a, Co-immunoprecipitation (co-IP) between overexpressed MITOK and
MITOSUR (representative of three independent experiments). b, Blue-
native PAGE of digitonin-permeabilized mitochondria. Representative of
two independent experiments. c, Co-immunoprecipitation of endogenous
MITOK using liver mitochondria. Representative of two independent
experiments. d, ΔΨm measurements in HeLa cells transfected with the
indicated constructs. n > 9 biological replicates from 3 independent
experiments, *P < 0.01 using two-way analysis of variance (ANOVA)
with Holm-Sidak correction. NS, not significant. AU, arbitrary units;
TMRM, tetramethylrhodamine methyl ester. e, Measurements of Ca2+
concentration in mitochondria ([Ca2+]mt) (mean ± s.d.) in HeLa cells that
express the indicated constructs; n = 8 biological replicates (representative
of 3 independent experiments), *P < 0.001 using two-way ANOVA with
Holm-Sidak correction. f, Current traces (left) and histograms (right) of
MITOK together with MITOSUR(K513A), before and after the addition of
2 mM Mg and ATP. Representative of three independent preparations.


the absence of divalent cations (Extended Data Fig. 5k) and the presence
of Na+ only (Extended Data Fig. 5l). Overall, all these features
recapitulate the fundamental electrophysiological properties and the
consensus pharmacological profile of mitoKATP, with MITOK
forming the K+-permeant channel and MITOSUR acting as a modulatory
subunit that carries the ATP-binding site. We next investigated
the membrane topology of MITOSUR by performing a protease protection
assay using an antibody that covers the ATP-binding region
(amino acids 394-693). Extended Data Figure 6a shows that proteinase
K leads to the loss of the MITOSUR-specific band, which indicates
that the ATP-binding cassette is exposed to the intermembrane space.
Overall, the membrane orientations of both MITOK and MITOSUR
are supported by previous bioenergetics studies and independent proteomic
approaches.

In light of this architecture, we reasoned that the unregulated K+
uptake in cells that overexpress MITOK alone must be the cause of 
organelle impairment (Fig. 1). If this is true, the combined overex- 
pression of human MITOSUR and mouse MITOK should reverse the 
mitochondrial dysfunction. To test this, we first verified the physical 
interaction between human MITOSUR and mouse MITOK through 
co-immunoprecipitation (Fig. 3a). In addition, immunoblot analysis 
of digitonin-solubilized mitochondrial complexes in native conditions 
revealed an approximately 500-kDa band that reacted with both
MITOK and MITOSUR antibodies (Fig. 3b), a size that is compatible
with an octamer (four MITOK and four MITOSUR components).
Accordingly, immunoprecipitation of endogenous MITOK could
efficiently pull down MITOSUR using both mouse and human mitochondria
(Fig. 3c, Extended Data Fig. 6b). In terms of function, the
overexpression of mouse MITOK alone caused a decrease of both mitochondrial
membrane potential and Ca2+ uptake (Fig. 3d, e), whereas
the overexpression of the MITOSUR subunit alone did not affect these
parameters. Most importantly, the combined overexpression of the two
subunits fully rescued ΔΨm and Ca2+ dynamics (which were impaired
by mouse MITOK alone), suggesting the recovery of the proper channel
gating. We also generated a MITOSUR mutant (MITOSUR(K513A))
that is unable to bind ATP. This mutant could interact with mouse
MITOK (Extended Data Fig. 6c) but did not respond to ATP in electrophysiological
experiments (Fig. 3f) and did not rescue the loss of
mitochondrial membrane potential and Ca2+ accumulation that is
caused by the overexpression of mouse MITOK (Fig. 3d, e). This provides
further confirmation that ATP acts as channel inhibitor. Overall,
our data indicate that MITOK and MITOSUR form a complex that
is responsible for the ATP-sensitive mitochondrial K+ transport both
in vitro and in situ.
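
The stoichiometry proposed for the approximately 500-kDa band can be checked with simple arithmetic. The sketch below is only a back-of-the-envelope calculation under stated assumptions: the band is taken to contain nothing but four MITOK and four MITOSUR subunits, MITOK is taken at the 45 kDa quoted for the full-length protein, and the native-gel mass is treated as exact (in practice, bound detergent and lipid make such estimates approximate). The function name is hypothetical.

```python
def implied_partner_mass_kda(complex_kda, known_subunit_kda, n_known, n_partner):
    """Mass per partner subunit implied by a complex of fixed stoichiometry."""
    remaining = complex_kda - n_known * known_subunit_kda
    if remaining <= 0 or n_partner <= 0:
        raise ValueError("stoichiometry is incompatible with the complex mass")
    return remaining / n_partner


# Assumed inputs: ~500 kDa band, 45 kDa MITOK, 4 + 4 octamer.
mitosur_kda = implied_partner_mass_kda(500.0, 45.0, n_known=4, n_partner=4)
print(mitosur_kda)
```

Under these assumptions the octamer model implies roughly 80 kDa per MITOSUR subunit, so the model is plausible only if the mature regulatory subunit is of about that size.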


MITOK controls mitochondrial volume 

Despite consensus regarding the cytoprotective role of mitoKATP
opening in stress conditions (which is mainly based on pharmacological
studies), the constitutive physiological function of this channel
remains obscure. We therefore generated HeLa cells that are knockout
for MITOK by CRISPR-Cas9 DNA cleavage, using two different
guides (Extended Data Fig. 7a, b). We first confirmed that MITOK is
required for ATP-dependent K+ fluxes in mitochondria, by measuring
mitochondrial swelling rates in a K+-based buffer. In isolated
wild-type mitochondria, ATP decreases, and diazoxide increases, the
swelling rates (through the inhibition and activation of the mitoKATP
channel, respectively) (Fig. 4a). Accordingly, mitochondria isolated
from MITOK-knockout cell lines swell at a constant rate, independently
of either ATP or diazoxide (Fig. 4b).

MITOK-knockout cells were viable and showed a highly interconnected
mitochondrial network by optical microscopy. Although the
gross morphology appeared similar, ablation of MITOK led to the
appearance of several doughnut-shaped (toroidal or ring-like) mitochondria
(Extended Data Fig. 7c), a phenotype that is associated with
impaired organelle K+ homeostasis. ΔΨm was intact, but HeLa cells
knockout for MITOK undergo asynchronous, rapid and transient
depolarizations of single mitochondria (Extended Data Fig. 7d, e),
a phenomenon known as mitochondrial 'flickering' or 'flashes'.
Importantly, this phenotype is specific, as shown by the fact that the
reintroduction of human MITOSUR and mouse MITOK restored ΔΨm
stability (Fig. 4c). In terms of oxidative performance, the ablation of
MITOK caused a decrease in the basal and maximal oxygen consumption
rates, despite the similar levels of expression of components of the
electron transport chain (Fig. 4d, Extended Data Fig. 7f). This could be
partially rescued by (i) pharmacological treatment with a minimal dose
of valinomycin (which has no effect in control cells) (Extended Data
Fig. 8a) and (ii) the reintroduction of human MITOSUR and mouse
MITOK (Extended Data Fig. 8b). To understand the causes of altered
organelle function, we investigated mitochondrial ultrastructure by
transmission electron microscopy. Although the gross mitochondrial
morphology was preserved, MITOK-knockout cells show enlarged
cristae (Fig. 4e), which is consistent with the effect of the inhibition
of another inner mitochondrial membrane K+ channel. Normal
morphology of cristae was readily restored by the re-expression of the
mitoKATP channel (Fig. 4e). Given that K+ fluxes across the inner mitochondrial
membrane are the main determinants of the water content of
organelles, we speculated that cristae remodelling could be due to a
dysregulation of matrix volume (as suggested by swelling experiments)
(Fig. 4a, b). The inner mitochondrial membrane rapidly rearranges in
response to osmotic changes: matrix contraction leads to expanded
cristae and matrix swelling causes the collapse of the intracristae compartment.
Genetic ablation of MITOK caused both a widening of the
intracristae space (Extended Data Fig. 8c) and a lower oligomerization
of OPA1 (an additional biochemical readout for cristae remodelling, for
which higher multimerization correlates with tighter cristae). Even
in these cases, valinomycin partially recovered normal morphology of
the cristae (Extended Data Fig. 8c, d).

We then considered how the mitoKATP channel affects organelle
adaptations to energy stress. We treated wild-type and MITOK-knockout
cells with the glycolysis inhibitor 2-deoxyglucose, which
rapidly decreases global cellular metabolism (Extended Data Fig. 8e, f).
First, we tested how the mitochondrial morphology changes in
response to ATP depletion. Wild-type HeLa cells rapidly underwent
fragmentation of the mitochondrial network (Fig. 4f). By contrast, metabolic
inhibition in MITOK-knockout cells caused no evident change
of the overall mitochondrial morphology (Fig. 4f), which indicates that
mitochondrial morphology adapts promptly to the energetic state of
the cells through a mitoKATP-dependent mechanism. Then, we monitored
the production of reactive oxygen species (ROS) during metabolic
stress and/or pharmacological modulation of the mitoKATP channel.
Extended Data Figure 8g indicates that (i) loss of MITOK increases
ROS production, notwithstanding the decreased oxygen consumption
rate (which provides further support for the idea that this represents
latent mitochondrial dysfunction); (ii) diazoxide increases ROS in
wild-type but not in MITOK-knockout cells (which supports the idea
of the mitoKATP channel as a regulator of redox state); (iii) metabolic
stress can increase ROS production in control cells; and (iv) ROS levels
marginally increase when mitoKATP is absent, thus indicating that mitochondrial
K+ homeostasis impinges on the regulation of redox balance
during metabolic stress. Overall, our data indicate that the mitoKATP
channel regulates mitochondrial adaptations to cellular stress, possibly
through the regulation of matrix volume (Fig. 4g). In addition, as previously
suggested, the loss of MITOK increases cell death triggered
by oxidative stress (Extended Data Fig. 8h), which is consistent with
cristae widening.


MITOK is required for pharmacological preconditioning

Finally, we generated Mitok-knockout mice through the specific deletion
of exon 4, which contains most of the coding sequence. Overall, these
mice show no overt phenotype (being born at the expected Mendelian
ratio, with a similar aspect and weight gain) until at least four months
of age. To demonstrate the lack of mitoKATP activity, we measured
organelle K+ fluxes using 86Rb+ as surrogate. Energized mitochondria
isolated from wild-type livers showed ATP- and diazoxide-sensitive
K+ uptake (Fig. 5a). By contrast, neither ATP nor diazoxide
was able to alter K+ fluxes when MITOK was absent (Fig. 5b, c).
Finally, we performed ex vivo ischaemia-reperfusion experiments in
wild-type and Mitok-knockout mice and evaluated the cardioprotective
effect triggered by pharmacological preconditioning induced by
diazoxide. As shown in Fig. 5d, the hearts of untreated Mitok-knockout
mice are slightly more sensitive to the ischaemia-reperfusion protocol,
which provides further confirmation of the cytoprotective role of
MITOK. As previously shown, pharmacological preconditioning
with diazoxide efficiently protects the heart from reperfusion
damage. Most importantly, the effects of this pharmacological
Fig. 5 | MITOK is required for diazoxide-induced cardioprotection.
a, 86Rb+ flux in isolated mitochondria. n = 3 independent experiments.
b, 86Rb+ uptake rate. n = 3 independent experiments, *P < 0.029 using
two-tailed Student's t-test. c, Western blot of wild-type and MITOK-
knockout liver mitochondria. d, e, Heart injury after ischaemia-reperfusion
(I/R), evaluated as percentage of lactate dehydrogenase (LDH) release
(d, mean ± s.d., n > 5 independent mice, *P < 0.001 using two-way
ANOVA with Holm-Sidak correction) or percentage of infarct area after
staining with 2,3,5-triphenyltetrazolium chloride (TTC) (e, mean ± s.d.,
n > 7 independent mice, *P = 0.008). Single measurements are provided
in Source Data.


preconditioning are nearly lost in the hearts of Mitok-knockout mice
(as also shown by analysis of infarct size; Fig. 5e), which demonstrates
that mitoKATP is the molecular target of diazoxide, at least in this
pathological setting.

In conclusion, we have identified a protein complex that accounts for
ATP-sensitive K+ transport across the inner mitochondrial membrane,
composed of a channel-forming subunit (MITOK) and a regulatory
subunit that carries the ATP-binding domain (MITOSUR). Although
we do not exclude the possibility that other proteins could perform
similar activities in specific tissues, the in vitro reconstitution of this
complex is sufficient to reliably recapitulate the main electrophysiological
properties and pharmacological profile of the long-sought mitoKATP
complex. The mitoKATP complex that we identify represents a potential
mechanism for matching ATP availability to energy production,
and thus contributing to the homeostatic control of cellular metabolism
under stress conditions.
METHODS


No statistical methods were used to predetermine sample size. The experiments 
were not randomized and investigators were not blinded to allocation during 
experiments and outcome assessment. 

Chemicals, cell culture and transfection. All chemicals were purchased from 
Sigma-Aldrich, unless otherwise specified. All the experiments were performed 
in HeLa cells (ATCC number CCL-2) cultured in Dulbecco’s modified Eagle’s 
medium (DMEM) (Gibco no. 41966052, Thermo Fisher Scientific), supplemented 
with 10% fetal bovine serum (FBS) (Thermo Fisher Scientific), containing penicillin
(100 U/ml) and streptomycin (100 µg/ml) (Euroclone). When needed, cells were
seeded onto 13- or 24-mm glass coverslips and allowed to grow to 50% confluence
before transfection. Transfection was performed with a standard Ca2+-phosphate
procedure. 

Constructs. Mouse Mitok coding sequence (NCBI code NM_025689) was ampli- 
fied from mouse skeletal-muscle cDNA library by PCR using the following primers. 

For the cloning of Mitok-gfp: forward (fw), 5'-CTCGAGATGACAGGGTGCAGCCCCGT-3';
reverse (rv), 5'-GGATCCCGACTGGTCTTGAACAGCATGT-3'. The PCR fragment was
cloned into pEGFP-N1 (Clontech) using XhoI and BamHI sites.

For the cloning of Mitok-Flag into pcDNA3.1: fw, 5'-AAGCTTATGACAGGGTGCAGCCCCGT-3';
rv, 5'-CTCGAGTTACTTATCGTCGTCATCCTTGTAATCACTGGTCTTGAACAGCATGT-3'. The PCR
fragment was cloned into pcDNA3.1 (Thermo Fisher Scientific) using HindIII and XhoI sites.

For the cloning of Mitok-V5 into pcDNA3.1: fw, 5'-AAGCTTATGACAGGGTGCAGCCCCGT-3';
rv, 5'-CTCGAGTTACGTAGAATCGAGGAGACCGAGAGGGTTAGGGATAGGCTTACCACTGGTCTTGAACAGCATGT-3'.
The PCR fragment was cloned into pcDNA3.1 using HindIII and XhoI sites.

For the cloning of Mitok-6×His in pIVEX1.3WG: fw, 5'-CCATGGCAACAGGGTGCAGCCCCGTGTT-3';
rv, 5'-CTCGAGACTGGTCTTGAACAGCATGT-3'. The PCR fragment was cloned into
pIVEX1.3WG (Roche) using NcoI and XhoI sites.

For the cloning of Mitok-6×His in pET-28A(+): fw, 5'-AAGCTTGCATGACAGGGTGCAGCCCCGT-3';
rv, 5'-CTCGAGTTAACTGGTCTTGAACAGCA-3'. The PCR fragment was cloned into
pET-28A(+) (Novagen) using HindIII and XhoI sites.

Human MITOK coding sequences (NCBI codes NM_001256964 and 
NM_001256965) were amplified from human spleen cDNA library by PCR using 
the following primers. 

For the cloning of MITOK isoform 1 (NM_001256964) into pcDNA3.1:
fw, 5'-GGATCCATCTCAGGATGATGGGGC-3'; rv, 5'-GAATTCTTAGCTGGCTTTGAATAGCATGTAGAG-3'.
The PCR fragment was cloned into pcDNA3.1 using BamHI and EcoRI sites.

For the cloning of MITOK isoform 1 tagged with haemagglutinin
(HA) into pcDNA3.1: fw, 5'-GGATCCATCTCAGGATGATGGGGC-3';
rv, 5'-GAATTCTTAAGCGTAATCTGGAACATCGTATGGGTAGCTGGCTTTGAATAGCATGTAGAG-3'.
The PCR fragment was cloned into pcDNA3.1 using BamHI and EcoRI sites.

For the cloning of MITOK isoform 2 (NM_001256965) into pcDNA3.1: fw,
5'-GGATCCGCCACCATGGTGGCTCGAGGGCTTG-3'; rv, 5'-GAATTCTTAGCTGGCTTTGAATAGCATGTAGAG-3'.
The PCR fragment was cloned into pcDNA3.1 using BamHI and EcoRI sites.

For the cloning of MITOK isoform 2 tagged with HA into pcDNA3.1: fw,
5'-GGATCCGCCACCATGGTGGCTCGAGGGCTTG-3';
rv, 5'-GAATTCTTAAGCGTAATCTGGAACATCGTATGGGTAGCTGGCTTTGAATAGCATGTAGAG-3'.
The PCR fragment was cloned into pcDNA3.1 using BamHI and EcoRI sites.

MITOSUR expression plasmid (pCMV6-ABCB8-myc-Flag) containing the 
NM_007188 reference sequence (and corresponding to ABCB8 transcript variant 
2) was purchased from Origene (cat no RC224948). 

For the cloning of MITOSUR in pIVEX1.3WG: fw, 5'-GCGATCGCCCATATGCTGGTGCATTTA-3';
rv, 5'-CTACCGAGTACTTTAAACCTTATC-3'. The PCR fragment was cloned into
pIVEX1.3WG using NdeI and ScaI sites.

The generation of the MITOSUR(K513A) mutant was performed by mutagenesis PCR
using the wild-type MITOSUR-encoding vector as template and the mutagenesis
primer: 5'-GGCCAGTCTGGCGGAGGAGCGACCACCGTGGCTTCCCTG-3'.
The amino acid numbering refers to ABCB8 isoform 1 (Uniprot code Q9NUT2).

For the generation of the construct MITOSUR-Myc-P2A-Mitok-Flag
in pcDNA3.1, MITOSUR-Myc was amplified with the following primers: fw,
5'-GGATCCATGCTGGTGCATTTATTTCG-3';
rv, 5'-GAATTCCGGTCCAGGATTCTCTTCGACATCTCCGGCTTGTTTCAGCAGAGAGAAGTTTGTTGCCAGATCCTCTTCTGAGATGAGTTTCTGCTCGGACTTGTGCTGGTGGCTCC-3'.
The PCR fragment was cloned into pcDNA3.1 using BamHI and EcoRI sites.

Mitok-Flag was amplified with the following primers: fw, 5'-GAATTCATGACAGGGTGCAGCCCCGT-3';
rv, 5'-GCGGCCGCTTACTTATCGTCGTCATCCTTGTAATCACTGGTCTTGAACAGCATGT-3'. The PCR
fragment was cloned into the above-mentioned plasmid using EcoRI and NotI sites.

For the generation of mitochondrial-targeted mEmerald, mEmerald was
amplified from the mEmerald-Mito-7 plasmid (Addgene plasmid no. 54160), a
kind gift of M. Davidson, with the following primers: fw, 5'-AAGCTTGTGAGCAAGGGCGAGG-3';
rv, 5'-GAATTCTTACTTGTACAGCTCGTCCATG-3'.
The PCR fragment was cloned in a custom pcDNA3.1 plasmid containing four
repeated mitochondrial targeting signals from human COX8A (pcDNA3.1-4mt)
using HindIII and EcoRI sites.

All constructs were verified by Sanger sequencing. 
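
As a simple consistency check on the cloning primers listed above, the following Python sketch looks for the recognition sequence of the stated restriction enzyme near the 5' end of a primer. It is a minimal illustration, not part of the original workflow: the helper name sites_at_5prime, the 10-nucleotide search window and the choice of enzymes in the lookup table are assumptions made here, and the recognition sequences are the standard ones for these enzymes.

```python
# Standard recognition sequences for enzymes used in the cloning steps.
SITES = {
    "XhoI": "CTCGAG",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "EcoRI": "GAATTC",
    "NcoI": "CCATGG",
    "NotI": "GCGGCCGC",
}


def sites_at_5prime(primer, window=10):
    """List enzymes whose recognition site lies fully within the first
    `window` nucleotides of the primer (read 5' to 3')."""
    seq = primer.replace(" ", "").replace("-", "").upper()
    head = seq[:window]
    return [name for name, site in SITES.items() if site in head]


# Mitok-gfp primers from the Constructs section (cloned via XhoI and BamHI).
forward = "CTCGAGATGACAGGGTGCAGCCCCGT"
reverse = "GGATCCCGACTGGTCTTGAACAGCATGT"
print(sites_at_5prime(forward), sites_at_5prime(reverse))
```

A primer whose 5' end lacks the expected site, or carries a different one, would flag a transcription or extraction error in the sequence before any cloning is attempted.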
RNA extraction, reverse transcription and quantitative real-time PCR. For 
quantitative (q)PCR analyses, HeLa cells were lysed in an appropriate volume 
of Trizol reagent (Thermo Fisher Scientific). For human tissues, a commercial 
mRNA library was used (Clontech Human Total RNA Master Panel II, cat no. 
636643). The RNA was quantified with a NanoDrop (Thermo Fisher Scientific). 
Complementary DNA was generated with a cDNA synthesis kit (SuperScript 
II, Thermo Fisher Scientific) using the oligo(dT)12-18 primer (Thermo Fisher 
Scientific) and analysed by real-time PCR using the SYBR green chemistry 
(Thermo Fisher Scientific). Primers were designed using Primer-BLAST. Real-time
PCR standard curves were constructed by using serial dilutions of cDNA
of the analysed samples, using at least 4 dilution points and the efficiency of all 
primer sets was between 80 and 120%. The housekeeping gene ACTIN was used 
as an internal control for cDNA quantification and normalization of the amplified
products. All data are reported as mean ± s.d., from n = 3 experiments. In
the case of HeLa cells, three independent RNA extractions and reverse transcrip- 
tion reactions were used. In the case of human tissues, three technical replicates 
were used. qPCR primer sequences were as follows. For MITOK both isoforms, 
fw GGATGCTGCAGGAGGAGAAG, and rv CTTGGTCCTCTCAGCCCTTG; 
for MITOK isoform 1, fw CGGAACCGTAGGAGGGGTACT, and rv CTCCG 
AACCAGTACGTGGGG; for MITOK isoform 2, fw CGGTTTTCTCTTTG 
CAGGCT, and rv TCTTGGTCCTCTCAGCCCTT; and for ACTB, fw CCTTTTATG 
GCTCGAGCGGC, and rv CATCATCCATGGTGAGCTGGC.

The isoform-specific primers are less efficient than the non-specific
ones (Extended Data Fig. 1d); this is most probably due to the fact that the
5’ untranslated region is under-represented when oligo(dT) is used for reverse
transcription (all primers are, however, specific, as they do not detect substantial
levels of transcript in HeLa cells that are knockout for MITOK).
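
The primer efficiency mentioned above is conventionally derived from the slope of a standard curve (Cq plotted against log10 of the relative template amount) as E = 10^(-1/slope) - 1. The sketch below illustrates that calculation under this assumption; the dilution series and Cq values are invented for the example and are not data from this study, and the function names are hypothetical.

```python
import math


def linear_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def amplification_efficiency_percent(relative_inputs, cq_values):
    """Efficiency (%) from a standard curve of Cq versus log10(input)."""
    log_inputs = [math.log10(q) for q in relative_inputs]
    slope = linear_slope(log_inputs, cq_values)
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


# Hypothetical fourfold dilution series with four points (made-up Cq values).
dilutions = [1.0, 0.25, 0.0625, 0.015625]
cq = [20.0, 22.0, 24.0, 26.0]

efficiency = amplification_efficiency_percent(dilutions, cq)
acceptable = 80.0 <= efficiency <= 120.0  # range quoted in the text
print(f"efficiency = {efficiency:.1f}%, acceptable = {acceptable}")
```

In this invented series each fourfold dilution delays detection by exactly two cycles, which corresponds to perfect doubling per cycle.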
Expression and purification of MITOK and of MITOSUR. C41(DE3) E. coli cells 
were transformed with the plasmid expressing mouse MITOK. Expression was 
achieved as previously described. Five hours after induction with IPTG, cells were
collected and sonicated in 250 mM NaCl and 25 mM Tris, pH 8.0, with 1 µg/ml
leupeptin and pepstatin. The samples were subsequently centrifuged at 15,000g for
30 min at 4°C to separate the membrane fraction (pellet) from the soluble fraction
(supernatant). Then, the pellet was solubilized in 2.5% decyl-β-D-maltopyranoside
(Sigma-Aldrich) in the sonication buffer for 3 h, and the resulting soluble material 
was loaded onto Ni resin (HIS-Select Nickel Affinity Gel, Sigma-Aldrich). After 3 
washes in equilibration buffer (50 mM sodium phosphate, pH 8.0, 300 mM sodium 
chloride), MITOK was eluted with a 250 mM imidazole solution in equilibration 
buffer. All fractions were collected and tested using standard SDS-PAGE and west- 
ern blot analyses. Immuno-detection of the expressed channel was performed 
using anti-6×His tag antibody (Sigma-Aldrich) and anti-MITOK antibody. For
in vitro expression and electrophysiology, a previously described protocol was
used. In brief, human MITOSUR and mouse MITOK proteins were expressed 
either separately or together in an in vitro wheat (Triticum aestivum) germ lysate 
system based on the continuous exchange cell-free technique, using the Wheat 
Germ CECF Kit (Biotechrabbit). Synthesis was achieved for 24 h at 24°C under 
continuous mixing on a Thermomixer comfort unit (Eppendorf). After expres- 
sion, the reaction mixture was either loaded on a Ni-chromatography column or 
directly solubilized for 30 min with either Triton X-100 or digitonin (1% w/v). No 
differences were observed in channel activity depending on the detergent used. 
Following centrifugation, the supernatant containing the solubilized proteins was 
diluted 1:10 in 10 mM HEPES, pH 7.4. For co-expression experiments, 1:1 ratio 
of DNA was used. After expression, MITOK alone or the MITOK and MITOSUR 
reaction mix were solubilized with 1% Triton X-100 or digitonin, and incorporated 
into liposomes. Purified soybean asolectin was used to produce liposomes at 2 mg/ml
in 10 mM HEPES, 10 mM CaCl2, pH 7.3. After solubilization, the reaction mix was
incubated with liposomes for 15 min at room temperature. Liposomes were
pelleted and suspended in the same volume and subjected to alkaline extraction
by adding 1/10 volume of 2 M Na2CO3 to check for insertion of the proteins into
the liposomes (data not shown). Liposomes containing the proteins were frozen 
in small aliquots and used up to 24 hours after thawing. 
Electrophysiological recording of MITOK or MITOK and MITOSUR activi- 
ties in planar lipid bilayer and data analysis. A Warner Instruments (Hamden) 
BC-525C electrophysiological planar bilayer apparatus was used. Bilayers with a
capacitance of approximately 150 to 200 pF were prepared using partially purified (by
precipitation with cold acetone from a chloroform solution) asolectin solution in
decane:chloroform with a 100:1 ratio per mg of lipids (Sigma) across a 250-µm hole
in a polystyrene cuvette. The contents of both chambers were stirred by magnetic 
bars when necessary. Voltages reported are those of the cis chamber, and current 
is considered positive when carried by cations flowing from the cis compartment 
to the trans compartment. Three ml of recording solution were added to both 
compartments. 100 mM K-gluconate, 10 mM HEPES/KOH, pH 7.4 or 100 mM
KCl, 10 mM HEPES/KOH pH 7.4 medium ([K+] 117 mM) were used unless otherwise
specified. Lack of activity in the membrane before addition of the protein
was monitored for at least 5 min in each experiment and long-lasting (>30 min) 
control experiments showed no activity without addition of the protein (n = 50) 
or following addition of the detergent only (at the same concentration that is added 
with the proteins, n = 15). Ten microlitres of the solubilized and diluted (1:10, see
‘Expression and purification of MITOK and of MITOSUR’) MITOK or MITOK
and MITOSUR mixture incorporated into liposomes were added to the cis side. 
The final detergent concentration was between 0.0003% and 0.0006%. Channel 
modulators were added to both compartments to the required concentration for 
inhibition or activation. Electrical connections were made using Ag and AgCl 
electrodes and agar salt bridges to minimize liquid junction potentials. The current 
was digitized at a sampling rate of 10 kHz, the signals were filtered at 500 Hz and 
the data were analysed offline using pCLAMP8.0 (Molecular Devices). The channel 
recordings illustrated are representative of the most-frequently observed ampli- 
tudes of the opening channels under a given condition. The conductance values 
were calculated from the current-voltage relationship, averaged from at least three 
independent experiments. P(open) was calculated from segments of continuous 
recordings lasting 60 s. Amplitude histograms were obtained from >30-s gap-free 
current traces. The number of events in the amplitude histograms refers to the 
number of binned points at a given amplitude in the recordings. PK:PCl ratios were
calculated from the measured reversal potentials using the Goldman-Hodgkin-Katz
equation, taking into account the effective [K+] (117 mM trans versus 287 mM
cis). All analysis was performed without leak subtraction and current trace 
idealization. Analysis and fitting of data was performed using Origin 6.1 programs 
(for Gaussian and linear fitting). 
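
For illustration, a permeability ratio can be obtained from a reversal potential by inverting the Goldman-Hodgkin-Katz voltage equation for one monovalent cation and one monovalent anion. The sketch below is not the authors' analysis: it assumes that the reported voltage is that of the cis side relative to trans, that the anion concentration on each side equals the effective [K+] (287 mM cis, 117 mM trans), that concentrations can stand in for activities, and a temperature of 25°C. Function names are hypothetical.

```python
import math

R = 8.314462618   # gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def thermal_voltage_mv(temp_c):
    """RT/F in millivolts at the given temperature (degrees Celsius)."""
    return 1000.0 * R * (temp_c + 273.15) / F


def ghk_reversal_mv(anion_over_cation, cat_cis, cat_trans,
                    an_cis, an_trans, temp_c=25.0):
    """GHK reversal potential (mV, cis relative to trans)."""
    r = anion_over_cation
    ratio = (cat_trans + r * an_cis) / (cat_cis + r * an_trans)
    return thermal_voltage_mv(temp_c) * math.log(ratio)


def anion_over_cation_permeability(e_rev_mv, cat_cis, cat_trans,
                                   an_cis, an_trans, temp_c=25.0):
    """Invert the GHK voltage equation; returns P_anion / P_cation."""
    x = math.exp(e_rev_mv / thermal_voltage_mv(temp_c))
    return (cat_trans - x * cat_cis) / (x * an_trans - an_cis)


# Reversal potential and effective concentrations quoted in the text;
# setting the anion concentrations equal to [K+] is an assumption.
p_an_over_p_cat = anion_over_cation_permeability(
    e_rev_mv=-21.6, cat_cis=287.0, cat_trans=117.0,
    an_cis=287.0, an_trans=117.0)
print(f"P_anion/P_cation = {p_an_over_p_cat:.3f}")
```

The cation-to-anion ratio is the reciprocal of the returned value; with a relative anion permeability of zero the function `ghk_reversal_mv` reduces to the Nernst potential for the cation.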

Thermal aggregation assay. The thermal aggregation profile of human MITOSUR
and mouse MITOK proteins was obtained by an adaptation of the cellular
thermal shift assay to in vitro protein expression mixtures. In brief, WGL lysate
was solubilized in 0.8% digitonin then treated for 30 min at room temperature
under shaking with 1 mM ATP and 2 mM MgCl2 to avoid the interference of
any chaperone or folding helper components in the wheat-germ extract used for
protein expression. WGL lysate was aliquoted into PCR tubes in equal volumes
(40 µl) and each sample was incubated at a different temperature, between 37°C
and 95°C. A reference sample treated analogously was stored at 4°C and was used
for densitometry-value normalization. Each sample was exposed to the designated
temperature for 10 min in an Eppendorf 96-well thermal cycler, vortexed and
centrifuged at 30,000g for 25 min at 4°C to pellet aggregated and denatured proteins.
Each supernatant containing the soluble proteins fraction was carefully removed 
and transferred into a new tube. The soluble fraction was analysed and quanti- 
fied for each temperature using western blotting technique, loading an equivalent 
volume of each sample into the wells of the gel, and blotted using the indicated 
antibodies. The samples for each thermal shift assay were run on the same gel. All 
experiments were performed on four independent occasions, and data are given 
as the average from these experiments. The solid lines represent the best fits of the 
data using a Boltzmann sigmoidal fitting within the GraphPad Prism software. 
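
For readers unfamiliar with the model, a Boltzmann sigmoid has the form Y = Bottom + (Top - Bottom)/(1 + exp((T50 - T)/Slope)). The sketch below evaluates this function and recovers T50 from synthetic data with a coarse least-squares scan. It is an illustration only: the scan is a stand-in for, not a reproduction of, the nonlinear regression performed in GraphPad Prism, the data are generated rather than measured, and fixing Top = 1, Bottom = 0 and the slope is an assumption made to keep the example short.

```python
import math


def boltzmann(temp, top, bottom, t_half, slope):
    """Boltzmann sigmoid; a negative slope gives a curve that falls with temp."""
    return bottom + (top - bottom) / (1.0 + math.exp((t_half - temp) / slope))


def scan_t_half(temps, fractions, slope=-3.0):
    """Coarse least-squares scan for t_half in 0.1-degree steps (37 to 95)."""
    candidates = [t / 10.0 for t in range(370, 951)]

    def sum_squared_error(t_half):
        return sum((boltzmann(t, 1.0, 0.0, t_half, slope) - f) ** 2
                   for t, f in zip(temps, fractions))

    return min(candidates, key=sum_squared_error)


# Synthetic soluble-fraction data generated from the model itself
# (normalized to a 4 degree reference, so values run from ~1 to ~0).
temps = [37, 42, 47, 52, 57, 62, 67, 72, 77, 82, 87, 92, 95]
fractions = [boltzmann(t, 1.0, 0.0, 55.0, -3.0) for t in temps]

t_half_estimate = scan_t_half(temps, fractions)
print(f"estimated midpoint: {t_half_estimate:.1f} degrees C")
```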
Generation of MITOK-knockout cells. For the generation of MITOK- 
knockout cell lines, two Cas9 guides targeting different regions of the 
human MITOK gene were designed (GCCCCTCCGAACCAGTACGT and 
TCATGAGAAGGAGCGCACAA) using the MIT CRISPR design tool, and
cloned into the BsmBI site of the pLentiCrisprV2 plasmid, a kind gift of F. Zhang
(Addgene plasmid no. 52961). Lentiviral particles were produced by transfecting
293T cells with the transfer plasmids together with pRSV-Rev (Addgene plasmid 
no. 12253), pMDLg/pRRE (Addgene plasmid no. 12251) and pMD2.G (Addgene 
plasmid no. 12259) plasmids, kindly provided by D. Trono. Three days after
transfection, the supernatants were collected, centrifuged and cleared through 0.45-µm
cellulose acetate filters. Target cells were infected with viral particles and selected 
with one microgram per millilitre puromycin for one week. Dilution cloning was 
performed to obtain different monoclonal cell populations that were screened and 
validated for MITOK gene ablation through western blot. 

Antibodies, SDS-PAGE and western blot. Cells were lysed in RIPA buffer (150 mM 
NaCl, 25 mM Tris-Cl pH 8, 1 mM EGTA-Tris, 1% Triton X-100, 0.5% sodium 
deoxycholate and 0.1% SDS) supplemented with complete EDTA-free protease 
inhibitor mixture (Roche Applied Science) and PhosStop (Roche Applied Science) 
for 30 min on ice. Crude extracts were centrifuged at 15,000g for 10 min to remove
debris, and proteins in the supernatant were quantified using the BCA Protein
Assay Kit (Pierce). Thirty micrograms of proteins were dissolved in LDS sample
buffer (Life Technologies) supplemented with 100 mM dithiothreitol, heated for 
5 min at 90°C and loaded on 4-12% Bis-Tris NuPage gels (Thermo Fisher 
Scientific). After electrophoretic separation, proteins were transferred onto nitro- 
cellulose membranes by wet (Thermo Fisher Scientific) or semidry (BioRad) 
transfer. Membranes were blocked for 1 h at room temperature with 5% non-fat 
dry milk (BioRad) in TBS-T (50 mM Trizma, 150 mM NaCl and 0.1% Tween) and 
probed with the indicated primary antibodies overnight at 4°C. Isotype-matched,
horseradish-peroxidase-conjugated secondary antibodies (BioRad) were used,
followed by detection by chemiluminescence (SuperSignal Pico, Pierce). The
following primary antibodies were used: anti-MITOK(C-term) (1:1,000, Sigma HPA011408),
anti-MITOK(N-term) (1:10,000, Sigma HPA010980), anti-MCU (1:1,000, Sigma
HPA016480), anti-MICU1 (1:1,000, Sigma HPA037480), anti-HSP60 (1:5,000,
Santa Cruz sc-1052), anti-OPA1 (1:1,000, BD Biosciences 612606), anti-MITOSUR
(1:1,000, Abcam ab182662), anti-OXPHOS (1:1,000, Abcam ab110413), 
anti-TOM20 (1:10,000, Santa Cruz sc-11415) and anti-Flag (1:1,000, Cell Signaling 
no. 2368). Western blots are representative of at least three independent prepara- 
tions. Uncropped images of western blots used for the assembly of final figures are 
provided in Supplementary Fig. 1. 

Blue Native PAGE. Mitochondrial fractions were lysed in the appropriate vol- 
ume of NativePAGE Sample Buffer (Thermo Fisher Scientific) supplemented with 
3% (w/w) digitonin. Crude extracts were centrifuged at 15,000g for 10 min to
remove debris, and proteins in the supernatant were quantified using the BCA 
Protein Assay Kit (Pierce). One hundred micrograms of proteins were dissolved 
in NativePAGE sample buffer supplemented with Coomassie G-250 (Thermo 
Fisher Scientific) and loaded on a 4-16% Novex NativePAGE Bis-Tris Gel System 
(Thermo Fisher Scientific). After electrophoretic separation, proteins were 
transferred onto PVDF membranes and probed with the indicated antibodies. 
As molecular mass marker, NativeMark Unstained Protein Standard (Thermo 
Fisher Scientific) was used and stained with Colloidal Blue Staining Kit (Thermo 
Fisher Scientific). Isotype-matched, horseradish-peroxidase-conjugated second- 
ary antibodies (BioRad) were used, followed by detection by chemiluminescence 
(SuperSignal Pico, Pierce). 

Mitochondrial isolation, proteinase K protection and swelling assays. 
Mitochondria were isolated from HeLa cells or mouse liver through differential 
centrifugation as previously described. Mitoplasts were obtained through osmotic
swelling by incubating the mitochondrial fraction in 20 mM Tris-Cl pH 7.4 for 20 min
on ice. The same amount of mitoplasts was treated with proteinase K (100 µg/ml) at
4°C for the indicated time, and the proteolytic reaction was quenched by addition
of PMSF. Samples were then loaded on SDS-PAGE and processed for western blot,
as described in ‘Antibodies, SDS-PAGE and western blot’. Results are representative
of at least two different proteolytic reactions. 

For the swelling assay, mitochondria were isolated using a slightly modified
protocol to accelerate the procedure (as previously reported, mitoKATP activity
degrades quickly after organelle isolation). HeLa cells were initially disrupted
through a brief (4-s) sonication (using a Braun Labsonic P at full cycle and 30% 
amplitude). After differential centrifugation, 0.25 mg of mitochondria was
suspended in a cuvette containing 2 ml of swelling assay buffer (100 mM KCl, 20 mM
HEPES, 1 mM MgCl2, 2 mM Pi, 1 mM EGTA, 0.1% BSA, 5 mM succinate, 2.5 mM
glutamate, 2.5 mM malate, 1 µM oligomycin, pH 7.2). After 1 min of incubation,
absorbance at 520 nm was recorded using an Agilent Cary 100 UV-Vis spectro- 
photometer. Swelling rate was calculated as the decrease in absorbance over 1 
min of recording (using the SLOPE function of MS Excel). Two mitochondrial 
preparations were used. 
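
The rate calculation described above corresponds to an ordinary least-squares slope, which is what a spreadsheet SLOPE function returns. A minimal sketch follows; the absorbance readings and the 10-s sampling interval are invented for the example and are not values from this study, and the function name is hypothetical.

```python
def least_squares_slope(xs, ys):
    """Slope of the least-squares line of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical absorbance readings at 520 nm, one every 10 s for 1 min.
time_s = [0, 10, 20, 30, 40, 50, 60]
a520 = [0.800, 0.795, 0.790, 0.785, 0.780, 0.775, 0.770]

# Swelling lowers absorbance, so the rate is reported as a positive
# decrease in absorbance per minute.
swelling_rate_per_min = -least_squares_slope(time_s, a520) * 60.0
print(f"swelling rate: {swelling_rate_per_min:.3f} absorbance units per min")
```

The same helper applies wherever a rate is taken as the slope of a signal over time.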
Co-immunoprecipitation. For the interaction between human MITOSUR-Flag 
or MITOSUR(K513A)-Flag and mouse MITOK-V5, HeLa cells were transfected 
with the indicated plasmids. After 48 h, cells were lysed in co-immunoprecipitation
buffer (150 mM NaCl, 1% digitonin, 50 mM Tris-Cl pH 7.4, 1 mM EGTA-Tris
pH 7.4 and complete EDTA-free protease inhibitor mixture). Lysates were
centrifuged at 15,000g for 10 min, and supernatant was transferred into new 
tubes. One milligram of proteins was precleared using a control agarose resin 
(Thermo Fisher Scientific) for 30 min at 4°C. Precleared proteins were incubated 
with monoclonal anti-Flag-agarose-conjugated antibody (Sigma) for 3 h at 4°C. 
After 3 washes (of 10 min each) in co-immunoprecipitation buffer, 50 µl of
2× Laemmli buffer was added to the resin and heated for 5 min at 95°C. The
precleared lysate (input) and the immunoprecipitated (co-immunoprecipitation)
fractions were separated and blotted as described in ‘Antibodies, SDS-PAGE
and western blot’.

For the interaction between endogenous MITOK and MITOSUR, isolated mito- 
chondria from mouse liver or HeLa cells were lysed in co-immunoprecipitation 
buffer and processed as indicated in ‘Antibodies, SDS-PAGE and western blot’. One
milligram of precleared proteins was incubated with 5 µg of anti-MITOK(N-term)
(Sigma HPA010980) antibody for 3 h. Protein A-sepharose beads (GE Healthcare)
were added for 1 h and washed 3 times with co-immunoprecipitation buffer. Fifty
microlitres of 2× Laemmli buffer was added to the resin and heated for 5 min at
95°C. Results are representative of at least three independent transfections. 
Analysis of OPA1 oligomers. HeLa cells were treated with 1 mM BMH (Thermo 
Fisher Scientific) for 30 min at 37°C. After crosslinking reaction, cells were 
quenched and washed in PBS with 0.1% β-mercaptoethanol (BME) twice. Cells
were then lysed in RIPA buffer supplemented with BME and subjected to western
blot on NuPAGE Novex 3-8% Tris-acetate gradient gels (Thermo Fisher
Scientific). The western blot provided in the figures is representative of three dif- 
ferent crosslinking reactions. 

Immunofluorescence and confocal imaging. HeLa cells were grown on 24-mm 
coverslips until 50% confluence. Cells were then washed with PBS, fixed in 4%
formaldehyde for 10 min and quenched with 50 mM NH4Cl in PBS. Cells were
permeabilized for 10 min with 0.1% Triton X-100 in PBS and blocked in PBS containing
2% BSA for 1 h. Cells were then incubated with primary antibodies (anti-MITOK, 
anti-HSP60) for 3 h at room temperature and washed 3 times with 0.1% Triton
X-100 in PBS. The appropriate isotype-matched Alexa-Fluor-conjugated secondary 
antibodies (Thermo Fisher Scientific) were incubated for 1 h at room tempera- 
ture and coverslips were mounted with ProLong Gold Antifade reagent (Thermo 
Fisher Scientific). Alternatively, cells were transfected with mitochondrial-targeted 
DsRed or mEmerald. One day later, cells were fixed and mounted as described. 
Images were acquired on a Leica TCS-SP5-II-RS-WLL equipped with a 100×,
1.4 N.A. Plan-apochromat objective. Alexa Fluor 488 (or mEmerald) was excited
by the 488-nm laser line and images were collected in the 495-535-nm range. Alexa 
Fluor 555 (or DsRed) was sequentially excited with the 543-nm laser line and signal 
was collected in the 555-600-nm range. Pixel size was set below 100 nm to meet 
the Nyquist criterion. For each image, a z-stack of the whole cell was acquired, 
with a step size of 130 nm. Images are presented as maximum projections of the 
whole stack using the Fiji image processing package based on ImageJ. Images are
representative of at least three independent transfections. 

Analysis of mitochondrial morphology. HeLa cells were grown on 13-mm cover- 
slips until 50% confluence and transfected with a plasmid encoding for a mitochon- 
drially targeted mEmerald protein. After 36 h, cells were washed 3 times with PBS 
and incubated in a buffer based on Krebs-Ringer modified buffer
(KRB) that contained 5.5 mM 2-deoxyglucose. After 0, 15 or 60 min, cells were
fixed and processed for confocal imaging as described in ‘Immunofluorescence
and confocal imaging’. Images (single planes) were then analysed using a custom
ImageJ script. In brief, background- and noise-corrected images were thresholded,
and objects were counted with the ‘Analyze particles’ function (using 0.2 µm² as
a lower cutoff). Objects were then classified as fragmented (circularity >0.8 or
length <3 µm), elongated (circularity <0.2 or length >6 µm) or intermediate (all
other cases). Finally, for each cell, the area occupied by elongated, intermediate and 
fragmented mitochondria was normalized on the global mitochondrial area and 
expressed as percentage. At least 20 cells were analysed per condition (more than 
50 objects were counted in each cell) from 3 independent transfections. 
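
The classification rule above can be sketched in Python as follows. This is an illustration, not the custom ImageJ script used in the study. The text does not say which class wins when an object meets both a 'fragmented' and an 'elongated' criterion, so testing 'fragmented' first is an assumption, as are the class and function names and the treatment of the 0.2 µm² cutoff as inclusive.

```python
from dataclasses import dataclass


@dataclass
class MitoObject:
    area_um2: float      # object area in square micrometres
    circularity: float   # 0 (line-like) to 1 (perfect circle)
    length_um: float     # object length in micrometres


def classify(obj: MitoObject) -> str:
    """Apply the thresholds from the text; 'fragmented' is tested first."""
    if obj.circularity > 0.8 or obj.length_um < 3.0:
        return "fragmented"
    if obj.circularity < 0.2 or obj.length_um > 6.0:
        return "elongated"
    return "intermediate"


def area_percentages(objects, min_area_um2=0.2):
    """Percentage of total mitochondrial area per class for one cell."""
    kept = [o for o in objects if o.area_um2 >= min_area_um2]
    totals = {"fragmented": 0.0, "intermediate": 0.0, "elongated": 0.0}
    for o in kept:
        totals[classify(o)] += o.area_um2
    whole = sum(totals.values())
    if whole == 0:
        return totals
    return {name: 100.0 * area / whole for name, area in totals.items()}


# Hypothetical objects from one cell (values invented for the example).
cell = [
    MitoObject(area_um2=1.0, circularity=0.9, length_um=1.5),
    MitoObject(area_um2=2.0, circularity=0.5, length_um=4.0),
    MitoObject(area_um2=1.0, circularity=0.1, length_um=8.0),
    MitoObject(area_um2=0.1, circularity=0.9, length_um=0.5),  # below cutoff
]
percentages = area_percentages(cell)
print(percentages)
```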

Analysis of ROS production. HeLa cells were plated on 96-well black plates and 
grown until full confluency. Cells were then incubated for 45 min in a KRB-based 
buffer containing 5.5 mM glucose supplemented with 0.02% Pluronic F127 and 
5 µM CM-H2DCFDA. Cells were washed twice with PBS and incubated in KRB-based
buffer supplemented as indicated in ‘Analysis of mitochondrial morphology’.
Fluorescence (excitation 485/10 nm, emission 530/30 nm, recorded from
the bottom of the plate) was monitored on a Perkin Elmer Envision multi-mode 
plate reader operating at 37 °C in well-scan mode. Blank was subtracted using two 
wells containing unstained cells. Fluorescence was recorded every 60 min over 16 
h. Total fluorescence was calculated for each well at each time point and the rate 
of fluorescence increase was calculated using the SLOPE function in MS Excel. At 
least ten wells were analysed from three independent experiments. 

[Ca2+]mt measurements. HeLa cells were grown on 13-mm round glass coverslips
at 50% confluence and cotransfected with a low-affinity mitochondrially targeted
aequorin-based probe (mtAeqMut) together with the indicated plasmid (the
mock vector pcDNA3.1 was used as a control). Twenty-four or thirty-six hours 
after transfection, cells were incubated with 5 µM coelenterazine for 1-2 h in KRB
(125 mM NaCl, 5 mM KCl, 1 mM Na3PO4, 1 mM MgSO4, 5.5 mM glucose, 20 mM
HEPES, pH 7.4) at 37°C supplemented with 1 mM CaCl2 and then transferred
to the perfusion chamber. All aequorin measurements were carried out in KRB.
Agonists and other drugs were added to the same medium as specified in text and
figures. The experiments were terminated by lysing cells with 100 µM digitonin
in a hypotonic Ca2+-rich solution (10 mM CaCl2 in H2O), thus discharging the
remaining aequorin pool. The light signal was collected and calibrated into [Ca2+]
values by an algorithm based on the Ca2+ response curve of aequorin at
physiological conditions of pH, [Mg2+] and ionic strength, as previously described.
Alternatively, [Ca2+] measurements were carried out on a Perkin Elmer Envision
plate reader equipped with a two-injector unit. Cells were transfected as described 
in ‘Chemicals, cell culture and transfection’ in 24-well plates, and then replated into 
96-well plates (1:5 dilution) the day before the experiment. After reconstitution
with 5 µM coelenterazine, cells were placed in 70 µl of KRB and luminescence
from each well was measured for 1 min. During the experiment, histamine
was first injected at the desired concentration to activate calcium transients,
and then a hypotonic, Ca2+-rich digitonin-containing solution was added to
discharge the remaining aequorin pool. Output data were analysed and calibrated
with a custom-made macro-enabled Excel workbook. All the results are
expressed as mean ± s.d. and are representative of at least three independent
transfections. 

ΔΨm measurements. The measurement of ΔΨm is based on the distribution of the
mitochondrion-selective lipophilic cation dye TMRM (Thermo Fisher Scientific).
Cells were loaded with 20 nM TMRM for 30 min at 37°C and then transferred
to the imaging system. Images were acquired on a Zeiss Axiovert 200 microscope
equipped with a 40×, 1.3 N.A. PlanFluor objective. Excitation was performed
with a Deltaram V high speed monochromator (Photon Technology International) 
equipped with a 75 W Xenon Arc lamp. Images were captured with a high- 
sensitivity Evolve 512 Delta EMCCD (Photometrics). The system is controlled by 
Metamorph 7.5 and was assembled by Crisel Instruments. TMRM excitation was 
performed at 560 nm and emission was collected through a 590-650-nm bandpass 
filter. Images were acquired every 5 s with a fixed 200-ms exposure time. At the end 
of each experiment, 10 µM CCCP was added to collapse ΔΨm. After background
correction, the fluorescence value after addition of CCCP was subtracted for each
cell. For the analysis of basal ΔΨm, data are presented as raw fluorescence values
in resting conditions. For the analysis of ΔΨm flashes, data are presented as time
lapse of normalized fluorescence (F/F0). Data were obtained from at least three
independent preparations. All analyses were performed with the Fiji distribution
of ImageJ.
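
The two processing steps described above (subtraction of the post-CCCP value, then normalization to F/F0) can be sketched as follows. This is an illustration rather than the authors' Fiji workflow: the input trace is assumed to be already background-corrected, F0 is taken as the mean of the first few frames (the text does not specify how F0 was defined), and the numbers are invented.

```python
def subtract_cccp(trace, cccp_value):
    """Subtract the fluorescence remaining after CCCP from every frame."""
    return [value - cccp_value for value in trace]


def normalize_to_f0(trace, n_baseline=5):
    """Express a trace as F/F0, with F0 the mean of the first n_baseline frames."""
    f0 = sum(trace[:n_baseline]) / n_baseline
    return [value / f0 for value in trace]


# Hypothetical background-corrected TMRM trace for one cell (arbitrary units),
# sampled every 5 s; the last frame shows a transient drop.
raw_trace = [110.0, 110.0, 110.0, 110.0, 110.0, 60.0]
after_cccp = 10.0  # fluorescence of the same cell after CCCP addition

corrected = subtract_cccp(raw_trace, after_cccp)
f_over_f0 = normalize_to_f0(corrected)
print(f_over_f0)
```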

Transmission electron microscopy. HeLa cells were grown in 24-well plates 
and fixed with 2.5% glutaraldehyde in 0.1 M sodium cacodylate buffer pH 7.4 for 
1 h at 4°C, post-fixed with a mixture of 1% osmium tetroxide and 1% potassium
ferrocyanide in 0.1 M sodium cacodylate buffer for 1 h at 4°C and incubated 
overnight in 0.25% uranyl acetate at 4°C. After three water washes, samples were 
dehydrated in a graded ethanol series and embedded in an epoxy resin (Sigma). 
Ultrathin sections (60-70 nm) were obtained with an Ultrotome V (LKB) ultra- 
microtome, counterstained with uranyl acetate and lead citrate, and viewed with 
a Tecnai G2 (FEI) transmission electron microscope operating at 100 kV. Images 
were captured with a Veleta (Olympus Soft Imaging System) digital camera. For 
structural quantification, the cristae width was measured from all mitochondria 
from 15 cells for each condition. 

OCR measurements. OCR measurements were performed in intact HeLa 
cells using the XF24 Extracellular Flux Analyzer platform (Agilent) according 
to manufacturer’s instructions. Cells were counted and plated on XF24 cell- 
culture plates. The following day, growth medium was replaced with pre-warmed 
unbuffered DMEM (Sigma) and equilibrated for 1 h at 37°C. Oligomycin 
(2 µM), FCCP (0.4 µM), rotenone (0.5 µM) and antimycin A (0.5 µM)
were dissolved in assay medium and loaded on sensor cartridge ports. OCR
was detected under basal conditions followed by the sequential addition of
the indicated drugs.

Cell viability. HeLa cells of the indicated genotype were counted and plated in 
a 96-well plate. After 36 h, growth medium was replaced with KRB containing 
the indicated H2O2 concentration. After 2 h, PrestoBlue assay (Thermo Fisher
Scientific) was performed according to manufacturer's instructions. Data are 
presented as percentage absorbance at 570 nm (600 nm was used as a reference 
wavelength) relative to untreated cells. Three independent experiments with 16 
replicates each were performed. 

Generation of MITOK-knockout mice. Mitok-knockout mice were gener- 
ated by genOway on a C57BL/6N background. Two LoxP sites flanking exon 4 
of the mouse Mitok gene were introduced by homologous recombination. The 
genotype was verified by PCR with the following primers: Mitok knockout, fw 1
GCACCTTGTCAGCACCATGACAACTC; Mitok knockout, fw 2 GAGGGA
TCGCTGTGGAAGGCTGTAT; and Mitok knockout, rv GCGGACAAAGATTGT
GTCACTGTTTGC.

The knockout allele yields an amplification product of 769 bp, whereas the 
wild-type allele generates a 278-bp fragment. All mouse experiments were performed
in accordance with the Italian law D.L.vo n. 26/2014 and approved by local
(Organismo preposto al benessere degli animali, O.P.B.A.) and national (Ministry 
of Health) committees (376-2015PR). 
86Rb+ uptake measurements. After isolation, 200 mg of mitochondria from wild-type
and Mitok-knockout mouse liver were resuspended in swelling buffer (100 mM
KCl, 20 mM HEPES, 1 mM MgCl2, 2 mM Pi, 1 mM EGTA, 0.1% BSA, 5 mM succinate,
2.5 mM glutamate, 2.5 mM malate, 1 µM oligomycin, pH 7.2) containing trace
amount (1-2 µCi) of RbCl. Where indicated, the buffer was supplemented with
ATP (2 mM) and diazoxide (50 µM). After 1, 10, 20 and 30 min, mitochondria were
rapidly centrifuged and washed. The amount of 86Rb+ trapped within the organelle
was estimated by scintillation counting and normalized on protein content. Rb+
influx rate was calculated as the relative increase in isotope content over time
(using the SLOPE function of MS Excel). Results are expressed as mean ± s.d. of
three independent experiments.

Ischaemia-reperfusion experiments. Adult (4-month-old) wild-type and 
MITOK-knockout male mice were anaesthetized by intraperitoneal injection of 
Zoletil 100 (30 mg/kg). Hearts were perfused with bicarbonate buffer gassed with 
95% O2-5% CO2 at 37°C (pH 7.4) at a constant flux of 5 ml/min. Perfusion was 
performed in the nonrecirculating Langendorff mode, as previously described.
The perfusion buffer contained (in mM) 118.5 NaCl, 3.1 KCl, 1.18 KH2PO4,
25.0 NaHCO3, 1.2 MgCl2, 1.4 CaCl2 and 5.6 glucose. Hearts were treated as follows
(n > 5 hearts per group): after 10 min of normoxic stabilization, hearts were
subjected to 40 min of global no-flow ischaemia followed by 15 min of reperfusion.
Pharmacological preconditioning was carried out by perfusion in the presence of
diazoxide (30 µM) for 10 min, followed by the ischaemia-reperfusion protocol in
the absence of diazoxide. After reperfusion, hearts were quickly immersed into
PBS containing 0.5% Triton X-100 and homogenized for measurement of LDH.
For TTC staining, hearts were subjected to the ischaemia-reperfusion protocol
and frozen at -20°C until used for quantification of myocardial infarct size. The
hearts were cut into 5 transverse slices, incubated with TTC (1% w/v, pH 7.4) for 
20 min at 37°C and fixed overnight in 4% formaldehyde at 4°C. The slices were 
digitally photographed. The infarcted tissue stains a characteristic white colour, 
whereas the viable tissue stains red. The infarct area was expressed as percentage 
of total area minus cavities, and was calculated using ImageJ. The whole heart is 
exposed to ischaemia in Langendorff mode, and thus there is no need to normalize 
on area-at-risk. 

Measurement of LDH activity. To determine the amount of LDH released from 
the hearts exposed to ischaemia-reperfusion, coronary effluent was collected at 
1-min intervals during the 15 min of reperfusion, as previously described. At the
end of reperfusion, hearts were homogenized for assessing the residual activity 
of LDH in the whole tissue. LDH activity was determined by means of a classic 
procedure. Because all values were normalized to heart weight, the amount of LDH 
released was expressed as the percentage of total (that is, effluent and homogenate) 
to rule out possible changes owing to variations in heart size.
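
The normalization described above amounts to expressing released LDH as a fraction of the total (effluent plus homogenate). A minimal sketch follows, assuming that each effluent fraction and the homogenate are given as total activities in the same units and already normalized to heart weight; the function name and the numbers are hypothetical.

```python
def ldh_release_percent(effluent_activities, homogenate_activity):
    """LDH released during reperfusion as a percentage of total LDH."""
    released = sum(effluent_activities)
    total = released + homogenate_activity
    if total <= 0:
        raise ValueError("total LDH activity must be positive")
    return 100.0 * released / total


# Hypothetical values: fifteen 1-min effluent fractions and the residual
# activity measured in the homogenized heart (arbitrary units).
effluent = [2.0] * 15
homogenate = 70.0

release = ldh_release_percent(effluent, homogenate)
print(f"LDH release: {release:.1f}% of total")
```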

Statistical analysis of data. In bar graphs, data are presented as mean ± s.d. unless
specified. For box plots, the boundary of the box closest to zero indicates the 25th 
percentile, the line within the box marks the median, and the boundary of the box 
farthest from zero indicates the 75th percentile. Whiskers (error bars) above and 
below the box indicate the 90th and 10th percentiles, respectively. Dots repre- 
sent outlying points. Variance was calculated by one-way, two-way or three-way 
ANOVA as indicated in the legends, and multiple comparisons were assessed using 
the Holm-Sidak post hoc test. Where applicable, data points and exact P values 
are indicated in Source Data. All analyses were performed with SigmaPlot 12.0
(Systat Software) or Excel (Microsoft). 

Reporting summary. Further information on research design is available in 
the Nature Research Reporting Summary linked to this paper. 


Data availability 

Source Data tables are provided for Figs. 3d, e, 4a-d, f, 5a, b, d and Extended Data
Figs. 1d-f, 2b, 3g, 5a, b, i, 7d, 8a-c, f-h. All other data supporting the findings of
this study are available from the corresponding authors on request. 
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Extended Data Fig. 1 | Overexpression of mouse MITOK causes 
mitochondrial dysfunction. a, Membrane (pellet) and soluble 
(supernatant) proteins were separated from isolated liver mitochondria 
using ice-cold 0.1 M Na,CO; (pH 11.5). Western blot is representative 
of three independent experiments. b, Proteinase K protection assay 

in isolated liver mitochondria. Similar results were obtained in three 
independent reactions. c, Mitochondrial morphology of control and 
MITOK-overexpressing HeLa cells. Representative of five independent 


experiments. Scale bar, 10 jum. d, Representative images and average + s.d. 


traces of control and MITOK-GFP- expressing HeLa cells loaded with 
TMRM. n = 9 biologically replicates from 3 independent experiments. 
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e, [Ca?*] mt measurements (mean + s.d.) in intact HeLa cells that express 
the indicated constructs. n = 4 biological replicates, representative of 3 
independent experiments. *P < 0.001 using one-way ANOVA with Holm- 
Sidak correction. f, qPCR analyses of transcripts from HeLa cells or the 
indicated human tissues using specific (isoform 1 and isoform 2) or non- 
specific (pan) primer pairs for MITOK. Data are normalized to ACTB 
and expressed as mean + s.d. For HeLa cells, n = 3 biologically 
independent samples. For human tissues, n = 3 technical replicates. 

g, Protein expression of MITOK isoforms in HeLa cells transfected with 
the indicated constructs. Asterisk indicates a non-specific band. Image is 
representative of two independent experiments. 
Extended Data Fig. 2 | Localization and function of human 
MITOK isoforms. a, Immunolocalization of MITOK (green) and the 
mitochondrial marker HSP60 (cyan) in HeLa cells transfected with 
the indicated constructs. Images are representative of two independent 
experiments. Scale bar, 10 µm. b, [Ca²⁺]mt measurements (mean ± s.d.) in 
intact HeLa cells that express the indicated constructs. n > 6 independent 
samples, *P < 0.001 using one-way ANOVA with Holm–Sidak correction. 

Extended Data Fig. 3 | MITOK is a cation channel. a, b, Western blots 
and Coomassie staining of MITOK expressed and purified from E. coli 
(a) or WGL (b). Representative of three independent experiments. 
c, Current traces showing two channels gating together, resulting in 
flickering activity (top), or normal single-channel activity (bottom). 
The two traces were recorded in the same experiment, performed in 
100 mM K-gluconate medium. Representative of five independent 
experiments. d, Current trace showing burst-like activity. Recording was 
performed in 100 mM K-gluconate medium. Similar activity was present 
in more than six recordings. e, Representative traces of MITOK channel 
activity at the indicated voltages. Similar results were obtained in three 
independent experiments. f, Voltage ramp (from −120 mV to 0 mV) of 
MITOK recorded in 100 mM K-gluconate symmetrical medium (n = 3 
independent experiments). g, Single-channel I–V curves under symmetric 
(black) and asymmetric (grey) ionic conditions. Mean value ± s.d., 
n = 50 from 3–4 different experiments for each point. Fitting revealed an 
Erev = −21.6 ± 1 mV. n = 4 independent experiments. h, Representative 
traces (top) and amplitude histograms (bottom) before and after the 
addition of 2 mM Ba²⁺ (n = 4 independent experiments) in 100 mM KCl 
medium. i, Representative traces (top) and amplitude histograms (bottom) 
of MITOK activity before (control) and after addition of paxilline (40 µM). 
n = 4 independent experiments. 
Extended Data Fig. 4 | MITOK alone is insensitive to ATP. a, Activity 
of MITOK in 100 mM Na-gluconate. n = 5 independent experiments. 
b, Current traces of MITOK channel activity obtained from 60-s 
recordings (in 100 mM K-gluconate) before (top) and after (bottom) 
addition of 2 mM Mg and ATP. Representative of eight independent 
experiments. Voltages of the cis side are reported. c, Representative traces 
(left) and amplitude histograms (right) of MITOK activity before (control) 
and after the addition of 5-HD (100 µM). n = 5 independent experiments. 
Extended Data Fig. 5 | Biophysical characterization of recombinant 
human MITOSUR and mouse MITOK. a–c, Thermal shift assay analysis 
of human MITOSUR and mouse MITOK. Average curves (a), graphs of 
Taggr (b, expressed as mean ± s.d.) and western blot (c). Representative 
of four independent experiments. d, Membrane extraction and western 
blot (representative of two independent experiments) of in vitro 
co-expressed human MITOSUR and mouse MITOK incorporated into 
liposomes. e, f, Membrane topology assessed by proteinase K protection 
assay in reconstituted liposomes and probed for mouse MITOK (e) 
and human MITOSUR (f). g, The same experiment shown in Fig. 2a is 
here represented with a different time scale. h, Amplitude histograms of 
channel activity before (control) and after the first addition of 500 µM Mg and 
ATP, the second addition of 500 µM Mg and ATP and the third addition 
of 80 µM diazoxide. Similar results were obtained with four independent 
preparations. Open probabilities over a period of 120 s were: control, 
0.62; 0.5 mM Mg/ATP, 0.14; 1 mM Mg/ATP, 0; diazoxide, 0.75. i, Single-channel 
current (I)–voltage (V) relationship of human MITOSUR and 
mouse MITOK. Linear fitting revealed a chord conductance of 63 ± 3 pS. 
n = 4 independent experiments. j, Activity in the absence (control, top) 
and presence of 1 mM Mg²⁺ (bottom) in 100 mM K-gluconate medium. 
n = 3 independent experiments. k, Activity of human MITOSUR and 
mouse MITOK in 100 mM K-gluconate, 5 mM EDTA, 10 mM HEPES, 
pH 7.4. n = 3 independent experiments. l, Human MITOSUR and mouse 
MITOK channel activity in 100 mM Na-gluconate medium. n = 4 
independent experiments. 
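
The legend for panel i reports a conductance obtained by linear fitting of the single-channel current–voltage relationship. As a minimal sketch of that kind of fit, assuming an ohmic channel (so that the slope of the fitted line equals the chord conductance) and using synthetic points that were constructed to lie on a 63 pS line rather than any measured data, the calculation could look like the following. The function names `fit_line` and `conductance_ps` are hypothetical and are not taken from the study.

```python
def fit_line(voltages_mv, currents_pa):
    """Ordinary least-squares line through (V, I) points.

    Returns (slope, intercept) with slope in pA/mV and intercept in pA.
    """
    n = len(voltages_mv)
    if n < 2 or n != len(currents_pa):
        raise ValueError("need at least two paired (V, I) points")
    mean_v = sum(voltages_mv) / n
    mean_i = sum(currents_pa) / n
    sxx = sum((v - mean_v) ** 2 for v in voltages_mv)
    if sxx == 0:
        raise ValueError("voltages must not all be identical")
    sxy = sum((v - mean_v) * (i - mean_i)
              for v, i in zip(voltages_mv, currents_pa))
    slope = sxy / sxx
    return slope, mean_i - slope * mean_v


def conductance_ps(voltages_mv, currents_pa):
    """Slope of the fitted I-V line, converted to picosiemens.

    1 pA/mV equals 1 nS, so the slope is multiplied by 1000 to give pS.
    """
    slope_ns, _ = fit_line(voltages_mv, currents_pa)
    return slope_ns * 1000.0


# Synthetic, illustrative points only (not data from the paper):
# currents generated from an ideal 63 pS ohmic channel.
voltages = [-60.0, -40.0, -20.0, 20.0, 40.0, 60.0]   # mV
currents = [0.063 * v for v in voltages]             # pA
print(round(conductance_ps(voltages, currents), 1))
```

For a channel that rectifies, the slope of a global linear fit and the chord conductance at a given voltage would differ, so this sketch applies only under the ohmic assumption stated above.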
Extended Data Fig. 6 | MITOK and MITOSUR interact in situ. 
a, Proteinase K protection assay in isolated HeLa mitochondria. 
Similar results were obtained in two independent reactions. 
b, Co-immunoprecipitation of endogenous MITOK using mitochondria 
isolated from HeLa cells. FT, flow-through fraction; W3, third (last) 
co-immunoprecipitation wash. Representative of two independent 
experiments. c, Co-immunoprecipitation between overexpressed mouse 
MITOK and mutant human MITOSUR(K513A). Representative of two 
independent experiments. 
Extended Data Fig. 7 | Genetic ablation of MITOK in HeLa cells. 
a, Schematic of the MITOK gene. The expanded regions were used 
to design Cas9 guides (highlighted in red). b, Western blot of wild-type 
and MITOK-knockout HeLa cell lines. Representative of three 
independent experiments. c, Mitochondrial morphology in wild-type 
and MITOK-knockout HeLa cells. Scale bar, 10 µm. Asterisks are located 
near doughnut-shaped mitochondria. Similar results were obtained 
in five independent experiments. d, e, ΔΨm measurements in control 
and MITOK-knockout cells. Cells were loaded with TMRM, and 
normalized fluorescence in different regions was monitored through 
time. d, Representative traces of single mitochondria. e, Pseudo-coloured 
representative images of a HeLa cell knockout for MITOK, loaded with 
TMRM at the indicated time points. Similar results were obtained in four 
independent experiments. f, Western blot in HeLa cells of the indicated 
genotype. Representative of two independent experiments. 
Extended Data Fig. 8 | Loss of MITOK causes mitochondrial 
dysfunction. a, OCR measurements in wild-type and MITOK-knockout 
HeLa cells treated with either vehicle or 1 pM valinomycin for 1 h. 
Representative of three independent experiments. b, OCR measurements 
in wild-type and MITOK-knockout HeLa cells transfected with control or 
mitoKATP-expressing (MITOSUR-P2A-MITOK) plasmids. Representative 
of three independent experiments. c, Maximal cristae width in the 
indicated genotype. n > 12 individual cells (approximately 20 cristae per 
cell were measured) from 2 independent preparations. *P < 0.013 using 
two-way ANOVA with Holm–Sidak correction. d, OPA1 crosslinking 
(using 1 mM BMH) in wild-type and MITOK-knockout cells. Similar 
results were obtained in three independent experiments. e, f, Extracellular 
acidification rate (ECAR) (e) and OCR (f) measurements in intact cells 
of the indicated genotype. n = 5 biological replicates, representative of 2 
independent experiments. g, ROS production during energy stress. Cells 
were incubated in 5.5 mM of either glucose or 2-deoxyglucose in the 
presence or absence of 30 µM diazoxide, and fluorescence was monitored 
for 16 h. Box plots indicate the rate of ROS production over this time 
frame. n > 10 biological replicates from 3 independent experiments. 
*P < 0.05 using three-way ANOVA with Holm–Sidak correction. h, Cell 
death analysis in HeLa cells treated with 0, 100 or 500 µM H₂O₂. Data are 
normalized to the untreated condition, and expressed as mean ± s.d. n = 3 
independent experiments. *P < 0.003 using two-way ANOVA with 
Holm–Sidak correction. 
BCAA catabolism in brown fat controls 
energy homeostasis through SLC25A44 

Kenji Ikeda, Huixia Li, Ayano Ueno, Maki Ohishi, Takamasa Ishikawa, Kyeongkyu Kim, Yong Chen, 
Carlos Henrique Sponton, Rachana N. Pradhan, Homa Majd, Vanille Juliette Greiner, Momoko Yoneshiro, 
Zachary Brown, Maria Chondronikola, Haruya Takahashi, Tsuyoshi Goto, Teruo Kawada, Labros Sidossis, 
Francis C. Szoka, Michael T. McManus, Masayuki Saito, Tomoyoshi Soga & Shingo Kajimura 

Branched-chain amino acid (BCAA; valine, leucine and isoleucine) supplementation is often beneficial to energy 
expenditure; however, increased circulating levels of BCAA are linked to obesity and diabetes. The mechanisms of this 
paradox remain unclear. Here we report that, on cold exposure, brown adipose tissue (BAT) actively utilizes BCAA in the 
mitochondria for thermogenesis and promotes systemic BCAA clearance in mice and humans. In turn, a BAT-specific 
defect in BCAA catabolism attenuates systemic BCAA clearance, BAT fuel oxidation and thermogenesis, leading to diet- 
induced obesity and glucose intolerance. Mechanistically, active BCAA catabolism in BAT is mediated by SLC25A44, 
which transports BCAAs into mitochondria. Our results suggest that BAT serves as a key metabolic filter that controls 
BCAA clearance via SLC25A44, thereby contributing to the improvement of metabolic health. 


In addition to the well-known function of BAT as a thermogenic organ, 
studies using positron emission tomography–computed tomography 
(PET-CT) with ¹⁸F-fluorodeoxyglucose (¹⁸F-FDG) and fatty-acid tracers 
have demonstrated that BAT also serves as a metabolic sink for glucose 
and fatty acids. This function is tightly coupled with the ability to 
improve metabolic health: cold acclimatization stimulates uptake of 
glucose, triglyceride-rich lipoproteins and fatty acids in BAT, thereby 
contributing to improved systemic lipid metabolism. It remains 
unknown, however, whether BAT contributes to the clearance of any 
other metabolites and how such processes are regulated. Accordingly, 
we performed an unbiased metabolite analysis on sera from healthy 
human subjects (male, aged 23.4 ± 0.6 years old (all results are shown 
as mean ± s.e.m.), n = 33) with high BAT activity (standardized uptake 
value (SUV) > 4.03, n = 17) and low BAT activity (SUV < 4.03, n = 16) 
at 27°C (thermoneutral) and following cold exposure (19°C) for 2 h 
(Supplementary Table 1). Subjects with SUV > 4.03 were considered 
as the high-BAT group, on the basis of the median of the subjects in 
the study (Fig. 1a). The cold stimulus of 19°C was selected on the basis 
that BAT thermogenesis is stimulated at 19°C in adults without triggering 
skeletal muscle shivering (Extended Data Fig. 1a). Cold exposure 
stimulated lipolysis in adipose tissue, leading to a significant increase in 
circulating levels of non-esterified fatty acids in both groups, whereas 
cold exposure did not change blood glucose levels (Extended Data 
Fig. 1b, c). 
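
The grouping described above, in which subjects are assigned to a high-BAT or low-BAT group according to whether their SUV lies above or below a median-based cut-off, can be illustrated with a short sketch. The subject identifiers and SUV values below are invented for illustration, and the helper name `split_by_suv` is hypothetical; only the rule itself (SUV above the threshold is high, below is low) comes from the text.

```python
from statistics import median


def split_by_suv(suv_by_subject, threshold):
    """Return (high, low) lists of subject IDs, sorted for stable output.

    A subject whose SUV equals the threshold exactly falls in neither
    group, mirroring the strict '>' and '<' comparisons in the text.
    """
    high = sorted(s for s, suv in suv_by_subject.items() if suv > threshold)
    low = sorted(s for s, suv in suv_by_subject.items() if suv < threshold)
    return high, low


# Invented example values, not the study's measurements.
suv = {"s1": 1.2, "s2": 6.5, "s3": 3.9, "s4": 4.8, "s5": 2.7, "s6": 9.1}
threshold = median(suv.values())      # cut-off derived from the cohort median
high_bat, low_bat = split_by_suv(suv, threshold)
print(threshold, high_bat, low_bat)
```

In the study the cut-off was fixed at 4.03; passing that value as `threshold` instead of the computed median would reproduce the stated rule on real SUV data.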


Cold-activated BAT promotes systemic BCAA clearance 

Unexpectedly, we found that serum concentration of Val was significantly 
reduced, preferentially in high-BAT subjects following cold 
exposure, whereas no significant change was seen in low-BAT subjects 
(Fig. 1b). The cold-induced reduction in serum Val concentrations 
showed a significant inverse correlation with BAT activity measured 
by ¹⁸F-FDG-PET imaging (Fig. 1c). Similarly, cold-induced changes in 
Leu and total BCAA levels were inversely correlated with SUV, whereas 
no amino acids except Val and Leu showed a significant correlation 
(Fig. 1d, Extended Data Fig. 1d, Supplementary Table 2). Although 
skeletal muscle is a major organ that utilizes BCAA, there was no 
correlation of muscle mass with cold-induced changes in BCAA levels 
(Extended Data Fig. 1e). Consistent with the human study, plasma 
metabolomics in obese mice showed that cold exposure significantly 
reduced plasma Val, Leu and Ile levels (Fig. 1e, Extended Data Fig. 1f). 
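
The inverse correlation described above relates each subject's cold-induced change in an amino acid concentration to that subject's BAT activity (SUV). A minimal sketch of such an analysis, assuming a Pearson coefficient and using invented numbers rather than the study's data, is given below; the helper names `cold_induced_change` and `pearson_r` are hypothetical.

```python
from math import sqrt


def cold_induced_change(thermoneutral, cold):
    """Per-subject change: value after cold exposure minus value at thermoneutrality."""
    return [c - t for t, c in zip(thermoneutral, cold)]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Invented values for five subjects (arbitrary concentration units).
suv = [1.0, 2.5, 4.0, 6.0, 9.0]
val_27c = [230.0, 228.0, 235.0, 240.0, 232.0]   # thermoneutral
val_19c = [231.0, 225.0, 228.0, 226.0, 210.0]   # after cold exposure
delta_val = cold_induced_change(val_27c, val_19c)
r = pearson_r(suv, delta_val)
print(delta_val, r)
```

A negative `r` here means that subjects with higher SUV show a larger cold-induced fall in the amino acid, which is the direction of the relationship reported in the text. The figure legend notes that Pearson's or Spearman's coefficient was used as appropriate, so a rank-based variant would be needed for data that are not approximately linear.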

These observations caught our attention because epidemiological 
studies have demonstrated that increased circulating BCAA levels are 
strongly associated with obesity, insulin resistance and type 2 diabetes 
in humans and rodents, despite the fact that BCAA supplementation 
in healthy subjects is often associated with beneficial effects on muscle 
growth and energy expenditure. Expression or activity of mitochondrial 
BCAA enzymes, such as the branched-chain α-keto acid dehydrogenase 
(BCKDH) complex, in the white adipose tissue (WAT) is 
reduced in obese and diabetic states, and transplantation of WAT 
from wild-type mice into branched-chain aminotransferase (BCAT2)-deficient 
mice reduces circulating BCAA levels, suggesting that adipose 
tissue contributes to the regulation of circulating BCAA levels. 
The extent to which cold acclimatization controls systemic BCAA 
homeostasis via BAT remains unknown. 

Thus, we visualized Leu uptake in BAT using a PET-CT scan with 
¹⁸F-fluciclovine, a Leu-analogue tracer. Following cold acclimatization, 
¹⁸F-fluciclovine-PET-CT detected a robust increase in ¹⁸F-fluciclovine 
uptake in the BAT and a modest increase in the inguinal WAT of mice 
(Fig. 1f, g). Whereas the basal SUV in the liver and heart was higher 
than in BAT, no significant change was seen in these organs following 
cold exposure (Extended Data Fig. 2a). Consistent with a recent 
study, we found that BAT displayed the highest Val oxidation on 
cold exposure, relative to other metabolic organs including inguinal 
WAT, epididymal WAT and gastrocnemius muscle of mice (Extended 
Data Fig. 2b, c). Furthermore, Val oxidation in differentiated human 
brown adipocytes was significantly higher than in white adipocytes and 
was further enhanced by noradrenaline (Extended Data Fig. 2d). Of 
note, transcriptomics and proteomics data from mice and humans 
showed that more than 60% of genes encoding BCAA catabolic 
enzymes, including the gene for the rate-limiting enzyme BCAT2, were 
more highly expressed in brown adipocytes relative to white adipocytes 
(Extended Data Fig. 2e, f). Our previous analysis in humans also found 
that the BCAA catabolic pathway was highly and selectively induced 
by cold exposure in the supraclavicular BAT but not in the abdominal 
WAT (Extended Data Fig. 2g). Notably, BCAA is oxidized primarily 
in the mitochondria of BAT; BAT predominantly expresses the 
mitochondria-localized form BCAT2, but not the cytosolic isoform BCAT1 
(Extended Data Fig. 2h, i). Despite this knowledge, the mitochondrial 
transporter for BCAAs is unidentified, and it remains unknown how 
BCAAs are utilized in brown adipocytes. 

⁶Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA. ⁷Duke Molecular Physiology Institute, Duke University, Durham, NC, 
USA. ⁸Department of Molecular Endocrinology and Metabolism, Tokyo Medical and Dental University, Tokyo, Japan. ⁹Department of Microbiology and Immunology, University of California, San 
Francisco, San Francisco, CA, USA. ¹⁰Center for Human Nutrition, Washington University in St Louis, St Louis, MO, USA. ¹¹Laboratory of Molecular Function of Food, Graduate School of Agriculture, 
Kyoto University, Uji, Japan. ¹²Department of Kinesiology and Health, School of Arts and Sciences, Rutgers University, New Brunswick, NJ, USA. ¹³Department of Biomedical Sciences, Graduate 

Fig. 1 | Cold-induced BAT thermogenesis promotes systemic BCAA 
clearance in mice and humans. a, Left, ¹⁸F-FDG-PET-CT images of 
human experimental subjects following cold exposure. Right, SUV of 
¹⁸F-FDG in the BAT deposits. n = 17 (high BAT), n = 16 (low BAT). 
b, Circulating Val concentration in subjects in a at 27 °C (thermoneutral) 
and at 19°C (cold). c, Correlation between BAT activity and cold-induced 
changes in serum Val concentration in a. d, Correlation between 
cold-induced amino acid changes and BAT activity (y axis) against the degree 
of BAT-dependent amino acid changes (x axis) in a. e, Cold-induced 
changes in plasma amino acids in diet-induced obese mice at 30°C 
(thermoneutral (TN), n = 5) or 15°C (cold, n = 6). f, ¹⁸F-Fluciclovine-PET-CT 
images of mice acclimatized to 30°C (TN) or 15°C (cold) for two 
weeks. Arrows indicate interscapular BAT. g, SUV of ¹⁸F-fluciclovine in 
BAT. n = 5 per group. h, Morphology and haematoxylin and eosin (H&E) 
staining of interscapular BAT of PpargUCP1-KO and control mice. Scale 
bars, 50 µm. Representative result from two independent experiments. 
i, Plasma levels of BCAA in h during cold temperature (12°C). n = 7 per 
group. a–i, Biologically independent samples. Data are mean ± s.e.m.; 
two-sided P values by paired t-test (b), unpaired Student's t-test (e), or 
two-way repeated measures analysis of variance (ANOVA) (g) followed 
by post hoc paired or unpaired t-tests with Bonferroni's correction (i). 
Pearson's or Spearman's rank correlation coefficient was calculated, as 
appropriate (c, d). 


To determine whether BAT contributes to systemic BCAA clearance, 
we generated a BAT-ablation mouse model in which peroxisome 
proliferator-activated receptor-γ (PPAR-γ) was deleted in uncoupling 
protein 1 (UCP1)-expressing thermogenic adipocytes (PpargUCP1 
knockout (KO), Ucp1-cre;Ppargflox/flox). In contrast to littermate controls 
(Ppargflox/flox), the presumptive BAT in PpargUCP1-KO mice was 
composed of unilocular adipocytes and fibrotic tissues (Fig. 1h). Following 
cold exposure, plasma BCAA concentration was significantly reduced 
in control mice but not in PpargUCP1-KO mice (Fig. 1i). 


BAT-specific BCAA defect impairs energy homeostasis 

To examine the extent to which BCAA catabolism in BAT regulates 
energy homeostasis, we next generated a mouse model in which BCAA 
oxidation is impaired specifically in the BAT (BckdhaUCP1-KO mice, 
Ucp1-cre;Bckdhaflox/flox) (Fig. 2a, Extended Data Fig. 3a–c). Whereas 
no difference was seen in BAT mass and thermogenic gene expression 
between the genotypes on a regular diet (Extended Data Fig. 3d, e), the 
core-body temperature of BckdhaUCP1-KO mice was significantly lower 
than that of controls after cold exposure without affecting muscle 
shivering (Fig. 2b, Extended Data Fig. 3f). Tissue-temperature recording 
also detected impaired thermogenesis in the BAT of BckdhaUCP1-KO 
mice following treatment with noradrenaline, whereas no change was 
seen in muscle and liver temperature (Fig. 2c, Extended Data Fig. 3g). 
Notably, BckdhaUCP1-KO mice were intolerant to oral BCAA challenge 
compared with control mice (Fig. 2d, Extended Data Fig. 3h). Similarly, 
BckdhaUCP1-KO mice displayed higher plasma BCAA levels than controls 
following cold exposure (Extended Data Fig. 3i). These results 
indicate that BCAA oxidation is required for BAT thermogenesis and 
systemic BCAA clearance. 

Fig. 2 | BCAA oxidation in BAT is required for BCAA clearance 
and energy homeostasis. a, Immunoblotting of BCKDHA in BAT of 
BckdhaUCP1-KO and control (ctrl) mice. GAPDH was used as a loading 
control. Representative result from two independent experiments. Gel 
source data are presented in Supplementary Fig. 1. b, Rectal core body 
temperature following cold exposure at 8°C. n = 8 (control), n = 9 
(BckdhaUCP1-KO). c, Change in tissue temperature (temp) in BAT and 
muscle following treatment with noradrenaline (NA). n = 4 per group. 
d, Plasma BCAA levels at indicated time points after a BCAA oral gavage 
at 12°C. n = 8 per group. e, MPE of indicated metabolites derived from 
[U-¹³C₆]Leu in human brown adipocytes. Cells were treated with vehicle 
(veh) or noradrenaline for 1 h. n = 6 per group. αKG, α-ketoglutarate. 
f, Body weight of BckdhaUCP1-KO (n = 15) and control (n = 13) mice 
on high-fat diet at ambient temperature. g, Glucose tolerance test of 
mice in f. h, Insulin tolerance test of mice in f. i, Glucose oxidation in 
BAT normalized to tissue mass. n = 3 mice per group. j, PDH activity 
in BAT of mice maintained at 12 °C for one week. n = 5 (control), n = 6 
(BckdhaUCP1-KO). b–j, Biologically independent samples. Data are 
mean ± s.e.m.; two-sided P values by unpaired Student's t-test 
(e, i, j) or two-way repeated measures ANOVA (b–d, f) followed 
by post hoc unpaired t-test (g, h). 

To examine how a cold stimulus alters BCAA utilization in brown fat, 
we next used capillary electrophoresis time-of-flight mass spectrometry 
(CE-TOFMS) and performed Leu stable-isotope tracing in differentiated 
human brown adipocytes. The mole percentage enrichment 
(MPE) of tricarboxylic acid (TCA) cycle intermediates derived from 
[U-¹³C₆]Leu was quantified following noradrenaline treatment for 1 h 
(Extended Data Fig. 4a, Supplementary Table 3). We found that acute 
noradrenaline treatment significantly increased the MPE of TCA 
intermediates, including succinate (Fig. 2e, Extended Data Fig. 4b), although 
the fractional contribution of labelled Leu to the TCA cycle was 
relatively small. This rapid noradrenaline-stimulated BCAA oxidation was 
aligned with increased expression of many BCAA-oxidation enzymes 
in the BAT mitochondria within 8 h after cold exposure (Extended 
Data Fig. 4c, d) and the rapid oxidation of BCAA in BAT. Of note, Val 
supplementation rapidly increased oxygen consumption rate (OCR) in 
human brown adipocytes stimulated with noradrenaline (Extended 
Data Fig. 4e). The stimulatory effect requires the generation of the 
TCA-cycle intermediate succinate: inhibition of succinyl coenzyme 
A synthetase or succinate dehydrogenase by vanadate or malonate, 
respectively, blunted the Val effect on OCR (Extended Data Fig. 4f, g). 
We also found that supplementation of Val, Leu or Ile significantly 
enhanced noradrenaline-stimulated thermogenesis in brown adipocytes 
in a UCP1-dependent fashion (Extended Data Fig. 4h, i). BCAA 
supplementation or pharmacological BCAT2 activation significantly 
increased brown fat respiration in a BCKDHA-dependent manner; 
the reduced respiration in Bckdha-deficient cells was not the result of 
a general mitochondrial defect, because succinate supplementation, 
but not α-ketoisovalerate (KIV), restored noradrenaline-stimulated 
thermogenesis in Bckdha-deficient brown adipocytes (Extended Data 
Fig. 4i, j). Previous studies report that BCAA catabolism fuels de novo 
lipogenesis by generating monomethyl branched-chain fatty acids 
(mmBCFAs), and that mmBCFA synthesis in BAT is activated after 
one month of cold acclimatization. Consistent with these studies, 
proteomics data showed that the expression of mmBCFA synthesis 
enzymes, including carnitine acetyltransferase, was increased after 
three-week cold acclimatization, whereas many BCAA oxidative 
enzymes in the mitochondria were rapidly induced within 8 h of cold 
exposure and subsequently downregulated (Extended Data Fig. 4c, d). 
These data suggest a dynamic shift in BCAA utilization during cold 
acclimatization in BAT; that is, acute cold exposure activates BCAA 
oxidation in the TCA cycle, whereas chronic cold gradually promotes 
mmBCFA synthesis. 
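
The tracing experiment above is reported as mole percentage enrichment (MPE) of TCA-cycle intermediates. As a rough sketch of how MPE is commonly computed from a mass-isotopologue distribution, assuming the widely used definition MPE = 100 × Σ(i × M+i) / (n × Σ M+i) for a metabolite with n carbons, the calculation might look like this. Whether the authors used exactly this formula, and whether they corrected for natural ¹³C abundance, is not stated in this section; the function name and the example abundances are invented for illustration.

```python
def mole_percent_enrichment(isotopologue_abundances):
    """MPE from abundances of M+0, M+1, ..., M+n for an n-carbon metabolite.

    Each isotopologue M+i contributes i labelled carbons; the result is the
    percentage of all carbon positions in the pool that carry the label.
    No natural-abundance correction is applied in this sketch.
    """
    n_carbons = len(isotopologue_abundances) - 1
    if n_carbons < 1:
        raise ValueError("need abundances for at least M+0 and M+1")
    total = sum(isotopologue_abundances)
    if total <= 0:
        raise ValueError("total abundance must be positive")
    labelled = sum(i * a for i, a in enumerate(isotopologue_abundances))
    return 100.0 * labelled / (n_carbons * total)


# Invented distribution for a four-carbon metabolite such as succinate:
# M+0 .. M+4 abundances in arbitrary units.
succinate = [90.0, 4.0, 4.0, 1.0, 1.0]
print(mole_percent_enrichment(succinate))
```

With this definition an unlabelled pool gives 0 and a fully labelled pool gives 100, so a small MPE is consistent with the statement that the fractional contribution of labelled Leu to the TCA cycle was relatively small.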

Next, we examined the degree to which a BAT-specific defect in 
BCAA catabolism influences whole-body metabolism. BckdhaUCP1-KO 
mice on a high-fat diet gained significantly more body weight than 
littermate controls, owing to increased adipose tissue and liver mass, 
but not to changes in lean mass or food intake (Fig. 2f, Extended 
Data Fig. 5a–c). Consistent with previous studies showing that BAT 
thermogenesis controls hepatic triglyceride clearance, the livers 
of BckdhaUCP1-KO mice contained significantly higher levels of 
triglycerides than those of controls (Extended Data Fig. 5d). Of note, 
BckdhaUCP1-KO mice exhibited increased systemic glucose 
intolerance and insulin resistance compared with controls (Fig. 2g, h). 
Furthermore, glucose oxidation in the BAT of BckdhaUCP1-KO mice was 
significantly reduced relative to controls (Fig. 2i). Fatty acid oxidation 
in the BAT of BckdhaUCP1-KO mice was modestly reduced relative to 
controls (Extended Data Fig. 5e). The impaired glucose oxidation in 
BckdhaUCP1-KO mice was associated with reduced pyruvate dehydrogenase 
(PDH) activity in the BAT and inguinal WAT, and with increased 
phosphorylation of the E1 subunit of PDH at S300 and, to a lesser 
degree, at S293 (Fig. 2j, Extended Data Fig. 5f–h). 


SLC25A44 mediates mitochondrial BCAA transport 

Recognizing the role of BCAA catabolism in BAT thermogenesis, we 
next sought to answer the long-standing question: how do cells take up 
BCAAs into the mitochondria? As described earlier, brown adipocytes 
in humans and mice predominantly express the mitochondria-localized 
isoform BCAT2 in preference to BCAT1, but the mitochondrial BCAA 
transporter remains uncharacterized. Therefore, we hypothesized that 
thermogenic adipocytes would express a mitochondrial BCAA 
transporter. Members of the SLC25A family are promising candidates for 
this role, because many of the mitochondrial amino acid transporters 
belong to this family of solute carrier transporter proteins. In addition 
to the carnitine–acylcarnitine translocase SLC25A20 and the glutamate 
carrier SLC25A22, transcriptome analyses identified two uncharacterized 
SLC25A members, SLC25A39 and SLC25A44, that were abundantly 
expressed in mouse and human BAT (Fig. 3a, Extended Data 
Fig. 6a). Expression of SLC25A44, but not SLC25A39, mRNA in the 
human supraclavicular BAT was significantly increased after cold 
exposure and showed a positive correlation with UCP1 and BCKDHA 
mRNA expression (Fig. 3b, Extended Data Fig. 6b). SLC25A44 protein 
was localized to the mitochondria and more highly expressed in the 
BAT compared with other metabolic organs (Fig. 3c, Extended Data 
Fig. 6c, d). In addition, SLC25A44 expression was increased during 
brown adipogenesis (Extended Data Fig. 6e–g). 

Fig. 3 | Identification of SLC25A44 as a mitochondrial BCAA 
transporter. a, Expression profile of SLC25A family members in human 
supraclavicular BAT and abdominal subcutaneous WAT from the same 
individual at 27°C and 19°C (from a previous study). FPKM, fragments per kilobase 
of transcript per million mapped reads. b, Correlation of expression 
of SLC25A44 mRNA with that of UCP1 or BCKDHA in human BAT. 
Expression in thermoneutral (red) or cold (blue) conditions from six 
biologically independent subjects. r, Pearson's correlation coefficient. 
c, SLC25A44 protein expression in indicated tissues of mice. GAPDH 
was used as a loading control. Representative result from two independent 
experiments. Gel source data are presented in Supplementary Fig. 1. 
d, e, Mitochondrial uptake of indicated molecules in control and 
Slc25a44-KO brown adipocytes (d) or in Neuro2a cells expressing 
Slc25a44 or an empty vector (e). n = 3 biologically independent samples 
per group. f, [U-¹⁴C₆]Leu transport into mitochondrial liposomes from 
Slc25a44-KO brown adipocytes, expressing an empty vector (KO + vector) 
or Slc25a44 (KO + Slc25a44). n = 3 technically independent samples per 
group. Representative result from two independent experiments. Data are 
mean ± s.e.m.; two-sided P values by unpaired Student's t-test (d, e) or 
two-way ANOVA (f). 

To determine the function of SLC25A44, we generated Slc25a44-KO 
brown adipocytes using CRISPR-Cas9 (Extended Data Fig. 7a). 
Mitochondrial BCAA uptake assays showed that Val and Leu uptake 
was selectively and significantly reduced in Slc25a44-KO cells, whereas 
Slc25a44 deletion did not affect the mitochondrial uptake of other amino 
acids (Fig. 3d, Extended Data Fig. 7b, c). Similarly, depletion of Slc25a44 
by lentivirus short-hairpin RNAs (shRNAs) abrogated mitochondrial 
Val and Leu uptake, whereas Slc25a39 depletion did not affect Val and 
Leu uptake (Extended Data Fig. 7d, e). Conversely, ectopic expression of 
SLC25A44 in a neuroblastoma cell line (Neuro2a cells) with undetectable 
endogenous SLC25A44 was sufficient to selectively restore 
mitochondrial Val and Leu uptake (Fig. 3e, Extended Data Fig. 7f). 

To characterize SLC25A44 in a cell-free system, we prepared lipos- 
omes that were fused with the mitochondrial inner membrane from 
Slc25a44-KO brown adipocytes or Slc25a44-KO cells that ectopically 
expressed Slc25a44 (Extended Data Fig. 7g, h). We observed robust 
and rapid Leu uptake in the mitochondrial liposomes from SLC25A44- 
expressing cells that were preloaded with Leu and Glu, whereas there 
was no detectable Leu uptake in the control group (Fig. 3f, Extended 
Data Fig. 7i). There was no difference in Glu uptake between the two 
groups (Extended Data Fig. 7j). As an alternative cell-free system, 
we reconstituted proteoliposomes by fusing liposomes with purified 
SLC25A44 protein (Extended Data Fig. 7k, l). Consistent with the 
results from mitochondrial liposomes, we detected active Leu uptake 
into the proteoliposomes with purified SLC25A44 (Extended Data 
Fig. 7m). 
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Fig. 4 | SLC25A44 is required for BAT thermogenesis and BCAA 
catabolism. a, Expression of Slc25a44 mRNA in indicated tissues of 
Slc25a44°4"_KD and control mice. 1 = 4 per group. b, Immunoblotting 
of SLC25A44 in BAT of mice in a. B-actin was used as a loading control. 
Representative result from two independent experiments. Gel source data 
are presented in Supplementary Fig. 1. c, Morphology (top), H&E staining 
(middle) and GFP immunofluorescence (bottom) in BAT from a (DAPI 
was used for counter staining). Scale bars, 100 jum. Representative result 
from two independent mice. d, Tissue temperature of BAT and muscle 

in a following treatment with noradrenaline (arrows). n = 5 (control), 
n=7 (SIc25a44°“1_KD). e, Rectal core body temperature of Slc25a44-KD 
(n = 6) and control (n = 7) mice following cold exposure at 8°C. f, 

Val oxidation in indicated tissues normalized to tissue mass. n = 4 per 
group. g, Plasma BCAA levels in e following 8 h cold treatment at 8 °C. 

n = 6 per group. h, Noradrenaline-induced OCR normalized to total 


SLC25A44 is required for BCAA catabolism 

To determine the role of SLC25A44 in vivo, we used a modi- 
fied CRISPR system, using catalytically inactive Cas9 protein 
(dCas9) fused to Kriippel-associated box (KRAB) domain. Adeno- 
associated virus (AAV) expressing a guide RNA (gRNA) tar- 
geting Slc25a44 or enhanced green fluorescent protein (eGFP; 
control) was injected into the interscapular BAT of dCas9-KRAB 
mice that were generated by the site-specific integrase-mediated 
approach” (Extended Data Fig. 8a-c). This system enabled BAT- 
selective knockdown of SLC25A44 (SIc25a4424"-KD) (Fig. 4a, b, 
Extended Data Fig. 8d, e). We found that brown adipocytes in 
Slc25a44®47_KD mice contained larger lipid droplets than those in 
control mice (Fig. 4c). Moreover, noradrenaline-induced BAT ther- 
mogenesis in Slc25a44°“"-KD mice was significantly impaired rel- 
ative to controls without affecting muscle thermogenesis (Fig. 4d). 
Next, we generated transgenic mice expressing gRNA-targeting 
S1c25a44, which were subsequently crossed with dCas9-KRAB mice 
to generate SLC25A44-deficient (Slc25a44-KD) mice (Extended Data 
Fig. 9a, b). Transcriptional analyses detected no compensatory change 
in other SLC25A members in Slc25a44-KD brown fat (Extended Data 
Fig. 9c, d). Similar to $1c25a4424T_KD mice, the BAT of Slc25a44-KD 
mice contained larger lipid droplets and higher levels of triglycerides 
compared with controls, whereas the morphology of WAT, liver, and 
muscle of Slc25a44-KD mice was normal (Extended Data Fig. 9e, f). 
protein in control and Slc25a44-KO brown adipocytes. n = 9 per group 
(control + Val, Slc25a44-KO + Val + KIV), n = 10 per group (KO + Val, 
KO + Val + succinate). i, Val oxidation in inguinal WAT-derived white 
adipocytes expressing an empty vector or Slc25a44 after noradrenaline 
treatment. n = 5 (vehicle), n = 6 (noradrenaline). j, A proposed model 

of BCAA catabolism in thermogenic adipose cells. Cold stimuli activate 
BCAA uptake and oxidation in the mitochondria of thermogenic 
adipocytes. Mitochondrial BCAA oxidation promotes BAT thermogenesis. 
This process requires SLC25A44, the mitochondrial BCAA transporter. 
SLC7A5, L-amino acid transporter 1. a, d-i, Biologically independent 
samples. Data are mean ± s.e.m.; two-sided P values by unpaired Student’s 
t-test (a, f), one-way factorial (h) or two-way repeated measures ANOVA 
(d, e, g) followed by post hoc paired or unpaired t-test with Bonferroni’s 
correction (g) or Tukey’s test (h, i). 


Although we found no difference in the expression of Ucp1 and genes 
associated with the fatty acid synthesis and oxidation pathway between 
the two groups, the core body temperature of Slc25a44-KD mice was 
significantly lower than in controls following cold exposure without 
affecting muscle shivering (Fig. 4e, Extended Data Fig. 9g-i). Tissue- 
temperature recording confirmed that noradrenaline-stimulated BAT 
thermogenesis was impaired in Slc25a44-KD mice (Extended Data 
Fig. 9j). Furthermore, Val oxidation in the BAT of Slc25a44-KD mice 
was lower than controls, indicating that SLC25A44 is the primary 
BCAA transporter in BAT (Fig. 4f). Of note, cold exposure failed to 
lower plasma BCAA concentration in Slc25a44-KD mice (Fig. 4g). 
These results indicate that SLC25A44 is required for cold-stimulated 
BAT thermogenesis and systemic BCAA clearance in vivo. 

To determine the cell-autonomous function of SLC25A44 in brown 
adipocytes, we depleted SLC25A44 in human brown preadipocytes using 
lentiviral shRNAs that target SLC25A44 (Extended Data Fig. 10a, b). 
We found that SLC25A44 depletion caused a significant reduction in 
noradrenaline-induced OCR in the presence of Val (Extended Data 
Fig. 10c, d). Supplementation with KIV or succinate, which bypasses 
mitochondrial BCAA transport, restored noradrenaline-induced OCR 
in Slc25a44-KO cells indicating that depletion of SLC25A44 did not 
cause a general mitochondrial defect (Fig. 4h, Extended Data Fig. 10e). 
In addition, SLC25A44-depleted brown adipocytes displayed active 
mitochondrial respiration (Extended Data Fig. 10f, g). Conversely, 


overexpression of Slc25a44 in mouse inguinal WAT-derived adipocytes 
or C2C12 myotubes significantly increased mitochondrial Val uptake and 
oxidation and cellular respiration (Fig. 4i, Extended Data Fig. 10h-m). 


Discussion 

The results of this study suggest the following model (Fig. 4j): in addi- 
tion to glucose and fatty acids, cold stimuli potently increase mito- 
chondrial BCAA uptake and oxidation in BAT, leading to enhanced 
BCAA clearance in the circulation. This process requires SLC25A44, a 
mitochondrial BCAA transporter in brown adipocytes. In turn, defec- 
tive BCAA catabolism in BAT results in impaired BCAA clearance and 
thermogenesis, leading to the development of diet-induced obesity and 
glucose intolerance. 

This model has important implications for the regulation of 
systemic BCAA metabolism in an obese or diabetic state, which results 
in impaired BAT activity and increased circulating BCAA in humans 
and rodents. It has been suggested that the accumulation of incompletely
oxidized intermediates derived from BCAA oxidation, such as
3-hydroxyisobutyrate, causes insulin resistance. Conversely, lowering
circulating BCAA levels by inhibiting the kinase BDK or overexpression of
the phosphatase PPM1K in the liver improves glucose tolerance
independently of body-weight loss in rats. Furthermore, reduced
mitochondrial BCAA oxidation and subsequent intracellular accumulation
of BCAA leads to constitutive activation of mTOR signalling, resulting
in persistent IRS-1 phosphorylation by mTORC1 and inhibition of
insulin signalling. This study suggests a distinct yet non-mutually
exclusive mechanism in which impaired BAT activity in conditions of obesity
or diabetes reduces systemic BCAA clearance, whereas active BAT acts 
as a significant metabolic filter for circulating BCAA and protects 
against obesity and insulin resistance. Enhanced mitochondrial BCAA 
catabolism via SLC25A44 may serve as a promising strategy to improve 
systemic BCAA clearance and glucose homeostasis. 
METHODS 


Human subjects. Thirty-three healthy young male volunteers were recruited in 
Sapporo, Japan to investigate the role of BAT in circulating BCAA clearance dur- 
ing cold exposure. All participants were carefully instructed regarding the study 
and provided written informed consent. The protocols were approved by the 
Institutional Research Ethics Review Board of Tenshi College (Sapporo, Japan) 
(UMIN000016361). Human BAT activity was assessed by ¹⁸F-FDG-PET-CT scan
(Aquiduo; Toshiba Medical Systems) after the standardized non-shivering cold
exposure, as reported previously. All subjects had fasted for 12 h before
¹⁸F-FDG-PET-CT scanning. Following cold exposure, the volunteers were given an
intravenous injection of ¹⁸F-FDG (1.66-5.18 MBq per kg (body weight)) and
subsequently stayed in the same cold room for another 1 h. BAT activity was assessed
by measuring the SUV of ¹⁸F-FDG and Hounsfield Units from −300 to −10 in the
supraclavicular region using Fusion software (Toshiba Medical Systems). On the 
basis of the median of BAT activity, subjects were divided into a high-BAT-activity 
group and a low-BAT-activity group. Arterialized blood samples were obtained 
from the same subject right before cold exposure and after 2 h cold exposure at 
19°C between 09:00 and 11:30. Sera were used for metabolite analysis. Amino acid 
levels were corrected for total amino acid levels by linear regression, since
individual variation in most of the amino acids (85.3%) can be explained by total amino
acids. To minimize possible effects of seasonal variation of BAT activities, the
study was performed from January to March, during which the monthly average
of ambient temperature in Sapporo was between −3.5 and 2.1°C.
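
The two analysis steps described above (grouping by the median of BAT activity, and correcting amino acid levels for total amino acids by linear regression) can be illustrated with a minimal sketch. The function names are hypothetical, the handling of a subject exactly at the median is an assumption (the text does not specify it), and taking the regression residual as the "corrected" level is one common reading of the correction, not necessarily the exact procedure used in the study.

```python
from statistics import median


def split_by_median(bat_activity):
    """Label each subject 'high' or 'low' relative to the cohort median.

    Assumption: a value equal to the median is labelled 'low'.
    """
    cutoff = median(bat_activity)
    return ["high" if value > cutoff else "low" for value in bat_activity]


def correct_for_total(levels, totals):
    """Return residuals of an ordinary least-squares fit of levels on totals.

    The residual is the part of each amino acid level that is not
    explained by the subject's total amino acid level.
    """
    n = len(levels)
    mean_x = sum(totals) / n
    mean_y = sum(levels) / n
    sxx = sum((x - mean_x) ** 2 for x in totals)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(totals, levels))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(totals, levels)]
```

With an odd cohort size such as the 33 subjects here, the median is one subject's own value, so the tie rule decides which group that subject joins.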

Animals. All the mouse experiments in this study were performed following the 
guidelines established by the UCSF Institutional Animal Care and Use Committee. 
Adult male and female mice aged 8-16 weeks were housed at 23°C with 12-h light
cycles and free access to food and water, and were used for the experiments.
Mice were randomly assigned to the experimental groups at the time of purchase
or weaning. For the generation of BAT-specific Bckdha-KO mice (BckdhaUCP1-KO
mice), Bckdha-floxed mice were obtained from the European Mouse Mutant Cell
Repository (Bckdhatm1a(EUCOMM)Hmgu) and crossed with Ucp1-cre mice. For the
generation of BAT-specific Pparg-KO mice, Pparg-floxed mice were obtained from 
the Jackson Laboratory (Stock #004584) and crossed with Ucp1-cre mice. Both 
knockout mice were on the C57BL/6 background. 

For metabolic studies, male BckdhaUCP1-KO and littermate control mice at eight 
weeks old were fed on a high-fat diet (HFD, 60% fat, D12492, Research Diets) at 
ambient temperature. Fat mass and lean mass were measured in mice on a high- 
fat diet for ten weeks by Body Composition Analyzer EchoMRI (Echo Medical 
Systems). For glucose tolerance test, the mice fed with high-fat diet for 10 weeks 
were fasted for 6 h (from 8:00 to 14:00) and injected intraperitoneally with glucose 
(1.5 g per kg (body weight)). For insulin tolerance test (ITT) experiments, the mice 
fed with high-fat diet for 11 weeks were fasted for 3 h (from 10:00 to 13:00) and 
injected intraperitoneally with insulin (0.875 U per kg (body weight)). Blood sam- 
ples were collected at the indicated time points, and glucose levels were measured 
using blood glucose test strips (Abbott). BCAA tolerance test was performed in 
male BckdhaUCP1-KO and control mice on a high-fat diet for ten weeks. For BCAA
clearance test, mice were exposed to cold temperature under the fasting condition,
and blood samples were obtained at the indicated time points. For BCAA tolerance
test, mice received a single bolus of BCAA by oral gavage (500 mg per kg (body
weight); weight ratio: Val:Leu:Ile, 1:1.5:0.8) and were exposed to cold at 12°C
under the fasting condition. Blood was collected at the indicated time points and
total plasma BCAA levels were measured by using a commercially available kit
(ab83374, Abcam). Independently, plasma BCAA levels 3 h after oral BCAA gavage
were measured by flow-injection electrospray-ionization tandem mass spectrometry
and quantified by isotope-dilution technique using a method described
previously. In brief, plasma samples were spiked with a cocktail of heavy-isotope
internal standards (Cambridge Isotope Laboratories; CDN Isotopes), deproteinated 
with methanol, and esterified with butanol. Mass spectra for amino acid esters 
were obtained using neutral loss scanning methods. Ion ratios of analyte to the 
respective internal standard computed from centroided spectra were converted to 
concentrations using calibrators constructed from authentic amino acids (Sigma; 
Larodan) and dialysed fetal bovine serum (Sigma). 
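
As a worked illustration of the weight-based dosing described above, the sketch below converts a per-kilogram dose into the amount for one mouse and splits a BCAA bolus by the stated Val:Leu:Ile weight ratio of 1:1.5:0.8. The function names and the example body weight are hypothetical; only the per-kilogram doses and the ratio come from the text.

```python
def bolus_amount(dose_per_kg, body_weight_g):
    """Amount to give one animal, in the same unit as dose_per_kg.

    Example: 500 mg per kg for a 30 g mouse gives 15 mg.
    """
    return dose_per_kg * body_weight_g / 1000.0


def split_bcaa(total_mg, ratio=(1.0, 1.5, 0.8)):
    """Split a total BCAA mass into Val, Leu and Ile by weight ratio."""
    parts = sum(ratio)
    val, leu, ile = (total_mg * r / parts for r in ratio)
    return {"Val": val, "Leu": leu, "Ile": ile}


# Hypothetical 30 g mouse: glucose (g), insulin (U) and BCAA (mg) boluses.
glucose_g = bolus_amount(1.5, 30)
insulin_u = bolus_amount(0.875, 30)
bcaa_mg = split_bcaa(bolus_amount(500, 30))
```

Because the ratio sums to 3.3 parts, Leu makes up about 45% of the bolus by weight, Val about 30% and Ile about 24%.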

dCas9-KRAB mice were generated using a site-specific integrase-mediated
approach as described previously. In brief, dCas9-KRAB transgenic mice
on the FVB background contain a CAG promoter within the Hipp11
(H11) locus expressing the nuclease-deficient Cas9 fused to the zinc-finger
protein 10 (ZNF10) Krüppel-associated box (KRAB) repressor domain, together
with mCherry and the puromycin resistance cassette. dCas9-KRAB mice were
backcrossed with wild-type C57BL/6J mice and subsequently crossed with
gRNA-Slc25a44 transgenic mice to generate Slc25a44-KD mice.

BAT-specific Slc25a44-KD (Slc25a44BAT-KD) mice were generated by injecting
adeno-associated virus (AAV) expressing gRNA-Slc25a44 (AAV8-CAG-eGFP-U6-gRNA-long
tracr; custom order, Vector Biolabs) or control GFP (AAV8-CAG-eGFP)
into interscapular BAT following the published protocol. In short, AAV
was injected into the interscapular BAT of dCas9-KRAB adult mice at a viral titer
of 6.0 × 10¹¹ genomic copies (GC) per mouse. Fifty microlitres of AAV at a dose of
1.2 × 10¹⁰ GC µl⁻¹ was injected in each BAT depot (5 µl per injection, 10 locations
per depot). Efficacy of viral infection and knockdown was evaluated by
immunohistochemistry for GFP and quantification of SLC25A44 expression level.
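
The viral dose stated above is internally consistent, which a short calculation makes explicit: ten injections of 5 µl give 50 µl, and 50 µl at 1.2 × 10¹⁰ GC µl⁻¹ gives 6.0 × 10¹¹ GC. The sketch below is only an arithmetic check with hypothetical function names.

```python
def injected_volume_ul(volume_per_site_ul, n_sites):
    """Total volume delivered across all injection sites, in microlitres."""
    return volume_per_site_ul * n_sites


def total_genome_copies(titer_gc_per_ul, volume_ul):
    """Total genomic copies (GC) delivered for a given titer and volume."""
    return titer_gc_per_ul * volume_ul


volume = injected_volume_ul(5, 10)           # 5 µl at each of 10 locations
dose = total_genome_copies(1.2e10, volume)   # GC delivered in that volume
```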
Chemicals and antibodies. All chemicals were obtained from Sigma-Aldrich 
unless otherwise specified. The following antibodies were used in this study: 
UCP1 antibody (ab-10983, Abcam), BCAT1 antibody (TA504360, OriGene), 
BCAT2 antibody (9432, Cell Signaling Tech), BCKDHA antibody (sc-271538, 
Santa Cruz), TOM20 antibody (11802-1-AP, Proteintech), COX-IV antibody 
(4850, Cell Signaling), OXPHOS cocktail (Abcam, ab110413), PDH-E1α antibody
(sc-377092, Santa Cruz), PDH-E1α (pSer232) antibody (AP1063, Millipore),
PDH-E1α (pSer293) antibody (ab177461, Abcam), PDH-E1α (pSer300) antibody
(AP1064, Millipore), GAPDH antibody (sc-32233, Santa Cruz) and β-actin antibody
(A3854, Sigma-Aldrich). Polyclonal antibody for SLC25A44 was generated
by using the peptides (MEDKRNIQUEWEHLDKKKC, MMQRKGEKMGRFQVC 
and CKKLSLRPELVDSRH) as epitopes for immunization in rabbit (GeneScript). 
Cell culture. Brown adipocyte and beige adipocyte lines from C57BL/6 mice 
were established in our previous study. Similarly, immortalized human brown
adipocyte and white adipocyte lines were established previously. Mouse adipocyte
differentiation was induced by treating confluent preadipocytes with DMEM
containing 10% FBS, 0.5 mM isobutylmethylxanthine, 125 nM indomethacin,
2 µg ml⁻¹ dexamethasone, 850 nM insulin, 1 nM T3 and 0.5 µM rosiglitazone. Two
days after induction, cells were switched to maintenance medium containing 10%
FBS, 850 nM insulin, 1 nM T3 and 0.5 µM rosiglitazone. Mouse cells were fully
differentiated 6-7 days after inducing differentiation. Immortalized human brown 
preadipocytes were cultured with animal component-free medium (Stem Cell 
Technologies; #05449). Brown adipocyte differentiation was induced by treating 
confluent preadipocytes with animal component free adipogenic differentiation 
medium (Stem Cell Technologies; #05412) supplemented with T3 (1 nM) and 
rosiglitazone (0.5 µM). Human cells were fully differentiated four weeks after 
induction. Mouse embryonic fibroblasts (MEF) were isolated from dCas9-KRAB 
mice and immortalized by infecting retrovirus expressing SV-Large T antigen. A 
mouse neuroblastoma line, Neuro2a (89121404, Sigma-Aldrich), was cultured in 
minimum essential medium Eagle (Sigma-Aldrich, M4655) containing 10% FBS, 
1% non-essential amino acid solution (Sigma-Aldrich, M7145) and 1% penicillin- 
streptomycin solution on collagen-coated plates. C2C12 cells were differentiated 
into myotubes by culturing confluent cells with DMEM supplemented with 2% 
FBS and 850 nM insulin. HEK293S cells were infected with retrovirus expressing 
the C-terminal Flag-tagged Slc25a44 or an empty vector and cultured in suspen- 
sion with a FreeStyle 293 Expression Medium (Thermo Fisher; 12338018) sup- 
plemented with 2% FBS. HEK293 and C2C12 cells were purchased from ATCC. 
No commonly misidentified cell line was used in this study. All the cell lines were 
routinely tested negative for mycoplasma contamination. 
Stable-isotope-labelled Leu metabolome analysis. To determine the metabolic 
fate and catabolic flux of Leu in brown adipocytes, we used [¹³C₆,¹⁵N₁]Leu tracing
followed by CE-TOFMS (Agilent Technologies). Differentiated human brown
adipocytes were incubated in the BCAA-free medium supplemented with 2 mM
[¹³C₆,¹⁵N₁]Leu (608068, Sigma-Aldrich) and collected 1 h after the treatment
with noradrenaline, washed twice with 10 ml of 5% mannitol aqueous solution,
and subsequently incubated with 1 ml of methanol containing 25 µM internal
standards (methionine sulfone, 2-(N-morpholino)-ethanesulfonic acid (MES) and
D-camphor-10-sulfonic acid) for 10 min. Four hundred microlitres of the extracts
were mixed with 200 µl Milli-Q water and 400 µl chloroform and centrifuged at
10,000g for 3 min at 4°C. Subsequently, 400 µl of the aqueous solution was
centrifugally filtered through a 5-kDa cut-off filter (Human Metabolome Technologies)
to remove proteins. The filtrate was centrifugally concentrated and dissolved in 50
µl of Milli-Q water that contained reference compounds (200 µM each of
3-aminopyrrolidine and trimesate) immediately before metabolome analysis.

The concentrations of all the charged metabolites in samples were measured by 
CE-TOFMS, following the methods as previously reported. In brief, a fused silica
capillary (50 µm internal diameter × 100 cm) was used with 1 M formic acid as the
electrolyte. Methanol:water (50% v/v) containing 0.1 µM hexakis(2,2-difluoroethoxy)phosphazene
was delivered as the sheath liquid at 10 µl min⁻¹. Electrospray
ionization (ESI)-TOFMS was performed in positive-ion mode, and the capillary
voltage was set to 4 kV. Automatic recalibration of each acquired spectrum
was achieved using the masses of the reference standards ([¹³C isotopic ion of a
protonated methanol dimer (2MeOH + H)]⁺, m/z 66.0632) and
([hexakis(2,2-difluoroethoxy)phosphazene + H]⁺, m/z 622.0290). Quantification was
performed by comparing peak areas to calibration curves generated using
internal standardization techniques with methionine sulfone. The other conditions
were identical to those described previously. To analyse anionic metabolites, a
commercially available COSMO(+) (chemically coated with cationic polymer)
capillary (50 µm internal diameter × 105 cm) (Nacalai Tesque) was used with a
50 mM ammonium acetate solution (pH 8.5) as the electrolyte. Methanol-5 mM
ammonium acetate (50% w/v) containing 0.1 µM hexakis(2,2-difluoroethoxy)phosphazene
was delivered as the sheath liquid at 10 µl min⁻¹. ESI-TOFMS was
performed in negative ion mode, and the capillary voltage was set to 3.5 kV. For
anion analysis, trimesate and CAS were used as the reference and the internal
standards, respectively. The other conditions were identical to those described
previously. MPE of isotopes, an index of isotopic enrichment of metabolites, was
calculated as the percent of all atoms within the metabolite pool that are labelled
according to the established formula.
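
The formula itself is cited rather than printed here. A commonly used definition that matches the verbal description (percent of all labellable atoms in the metabolite pool that carry the label) is MPE = 100 × Σ(i × mᵢ) / (n × Σ mᵢ), where mᵢ is the abundance of the isotopologue with i labelled atoms and n is the number of labellable atoms. The sketch below implements that definition as an assumption; it may differ in detail (for example, natural-abundance correction) from the cited method.

```python
def mole_percent_enrichment(isotopologue_abundances):
    """Percent of labellable atoms in the pool that are labelled.

    isotopologue_abundances[i] is the abundance (peak area or fraction)
    of the species carrying i labelled atoms, for i = 0..n, so the list
    length fixes n, the number of labellable atoms.
    """
    n = len(isotopologue_abundances) - 1
    total = sum(isotopologue_abundances)
    if n < 1 or total <= 0:
        raise ValueError("need at least M+0 and M+1 with a non-zero pool")
    labelled_atoms = sum(i * a for i, a in enumerate(isotopologue_abundances))
    return 100.0 * labelled_atoms / (n * total)
```

For a two-atom metabolite whose pool is half unlabelled and half fully labelled, `mole_percent_enrichment([50, 0, 50])` evaluates to 50.0.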

DNA constructs for overexpression and knockdown studies. The lentiviral
expression plasmid that encodes the mouse Slc25a44 open reading frame was
obtained from GeneCopoeia (EX-Mm15289-Lv207-GS). The Slc25a44 sequence
was amplified from the lentiviral plasmid by PCR and cloned in-frame with a
Flag sequence into the retroviral expression vector (Addgene, #75085). Lentiviral
shRNA expression constructs targeting mouse Slc25a44 and Slc25a39 (shSlc25a44,
CS-MSH073484-LVRU6GH-01; shSlc25a39, MSH034465-LVRU6GH; scrambled
control, CSHCTR001-LVRU6GH), lentiviral shRNA expression constructs
targeting human SLC25A44 (shSLC25A44, HSH057134-LVRH1H; scrambled
control, CSHCTR001-LVRH1H), as well as lentiviral shRNA expression construct
targeting mouse Ucp1 were obtained from GeneCopoeia (shUcp1,
MSH028473-LVRH1MH). For virus production, HEK293T packaging cells were transfected
with 10 µg of lentiviral or retroviral plasmids and the packaging constructs
(VSVg, pMDL, and Rev) using a calcium phosphate method. After 48 h, the viral
supernatant was collected and filtered. Immortalized preadipocytes, Neuro2a or
HEK293S cells were incubated overnight with the viral supernatant
supplemented with 10 µg ml⁻¹ polybrene. Hygromycin at a dose of 50 or 200 µg ml⁻¹ was
used for selection of lentivirus-infected human cells and murine cells, respectively.
Blasticidin at a dose of 10 µg ml⁻¹ was used for selection of retrovirus-infected cells.
Generation of Bckdha-KO and Slc25a44-KO brown adipocytes. For generation
of Bckdha-KO brown adipocytes, preadipocytes isolated from BAT of Bckdhaflox/flox
mice were immortalized by using the SV40 Large T antigen as described
previously and subsequently infected with retrovirus containing Cre (#34565,
Addgene), followed by hygromycin selection at a dose of 200 µg ml⁻¹. For
generation of Slc25a44-KO brown adipocytes, an immortalized brown adipocyte cell line was
infected with lentivirus packaged by lentiCRISPRv2 (#98291, Addgene)
expressing Cas9 and gRNA for Slc25a44 (5'-GGTGCTCCCACTCGATGATC-3'). After
selection with 200 µg ml⁻¹ hygromycin followed by isolation of a monoclonal cell line,
we confirmed homozygous mutations in the Slc25a44 gene by DNA sequencing.
RNA preparation, quantitative RT-PCR and RNA-sequencing. Total RNA was 
extracted from tissue or cells using RNeasy mini-kit (Qiagen) and cDNA was
synthesized using iScript cDNA Synthesis kit (BioRad) according to the provided
protocols. RT-PCR was performed using an ABI ViiA7 PCR cycler. The primer 
sequences are listed in Supplementary Table 4. For RNA-sequencing, the libraries 
were constructed from total RNA and sequenced using a HiSeq 3000 instrument 
(Illumina) at the UCLA Technology Center for Genomics and Bioinformatics core 
by technical staff who were blinded to the experimental group. Sequenced tags were 
pseudo-aligned to mouse reference transcriptome. Transcript-levels estimated 
using Kallisto 0.44.0 were imported into R and expression levels per gene were 
estimated using the Bioconductor package tximport 1.10.0. 
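
The step from transcript-level estimates to per-gene expression can be pictured as summing the estimates of all transcripts that map to the same gene. The study used the R package tximport for this; the Python sketch below only illustrates the summing idea and is not a port of that package. The identifiers and the `tx2gene` mapping are hypothetical.

```python
from collections import defaultdict


def summarize_to_gene(transcript_abundance, tx2gene):
    """Sum transcript-level estimates into gene-level totals.

    transcript_abundance: dict of transcript ID -> estimate (e.g. TPM).
    tx2gene: dict of transcript ID -> gene ID.
    Transcripts missing from tx2gene are skipped.
    """
    gene_totals = defaultdict(float)
    for tx_id, abundance in transcript_abundance.items():
        gene_id = tx2gene.get(tx_id)
        if gene_id is None:
            continue  # unmapped transcript, ignored in this sketch
        gene_totals[gene_id] += abundance
    return dict(gene_totals)
```

For example, two transcripts of 1.5 and 2.5 TPM mapped to the same gene give a gene-level value of 4.0 TPM.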

BCAA oxidation assay. Differentiated adipocytes in a six-well plate were
washed with PBS and incubated in 1 ml Krebs-Ringer modified buffer (KRB)-HEPES
buffer, containing 2% BSA, 15 mM glucose, 200 nM adenosine, and
either 0.16 µCi ml⁻¹ [1-¹⁴C]Val together with 1 mM non-radioisotope (RI) Val
or 0.16 µCi ml⁻¹ [1-¹⁴C]Leu together with 1 mM non-RI Leu, at 37°C for 2 h.
Subsequently, 350 µl 30% hydrogen peroxide was added in each well, and [¹⁴C]CO₂
was trapped in the smears supplemented with 300 µl of 1 M benzethonium hydroxide
solution at room temperature for 20 min. Similarly, isolated tissue (20-30 mg)
was placed in a polypropylene round-bottom tube and incubated in the 1 ml KRB-HEPES
buffer containing 0.16 µCi ml⁻¹ [1-¹⁴C]Val at 37°C for 1 h. After adding
350 µl 30% hydrogen peroxide in the tube, [¹⁴C]CO₂ was trapped in the centre well
supplemented with 300 µl of 1 M benzethonium hydroxide solution for 20 min at
room temperature. BCAA oxidation was quantified by counting radioactivity of
trapped [¹⁴C]CO₂ using a scintillation counter.

Mitochondrial amino acid uptake assay. Differentiated adipocytes in 10 cm culture
plates were washed in cold PBS and incubated with KPBS at 4°C for 10 min.
Confluent Neuro2a cells were incubated with KPBS without washing in PBS to
minimize cell loss. After removing KPBS, mitochondria were isolated by using a
mitochondria isolation kit (Thermo Fisher; 89874) according to the provided
protocol. Isolated mitochondria were incubated with KRB-HEPES buffer, containing
2% BSA, 15 mM glucose, 200 nM adenosine, and either 0.32 µCi ml⁻¹ [U-¹⁴C]Val,
[U-¹⁴C]Leu, [U-¹⁴C]Ala, [U-¹⁴C]Phe, [U-¹⁴C]Thr, [U-¹⁴C]Glu, [U-¹⁴C]Asp,
[U-¹⁴C]Lys, [U-¹⁴C]Arg (Moraveck), or [1-¹⁴C]α-ketoisovalerate (American
Radiolabelled Chemicals) at 37°C for 1 h. After cooling down on ice, mitochondria
were washed in chilled PBS three times and homogenized in 100 µl RIPA buffer.
Mitochondrial amino acid uptake was quantified by counting radioactivity using
a scintillation counter and normalized to protein content.

Liposome preparation. Egg phosphatidylcholine (1.280 ml, 25 mg ml⁻¹ in CHCl₃,
Avanti Polar Lipids, 840051), E. coli polar lipid (1.344 ml, 25 mg ml⁻¹ in CHCl₃, Avanti
Polar Lipids, 100600), and cardiolipin (0.640 ml, 10 mg ml⁻¹ in CHCl₃, Sigma-Aldrich,
C0563) were mixed in a round-bottomed flask. The solvent was removed
by rotary evaporation under vacuum at room temperature to form a lipid film, and
further dried under strong vacuum for at least 2 h to remove trace CHCl₃. Four
millilitres of 10 mM PIPES buffer pH 7.4, which contains 25 mM non-radioisotope
Leu and Glu as internal substrates, was gently added to the dried lipid film. The
flask was kept overnight at 4°C to allow the formation of large unilamellar vesicles
(LUVs), followed by incubating at 70°C for 30 min. The LUVs were extruded seven
times through an extruder (Avanti Polar Lipids, 610000), which was assembled
with two drain disks separated with a 1.0-µm-pore polycarbonate membrane (GE
Whatman, 889-78159). The extruded liposome was concentrated to 40 mg ml⁻¹
lipid concentration in 10-kDa centrifugal filters (Millipore, UFC505024).
Mitochondrial liposome assay. Mitochondria were isolated from differentiated
Slc25a44-KO brown adipocytes stably expressing either Slc25a44 or an empty vector
(90 plates per group). The mitochondrial membrane was obtained by mechanical
disruption and sonication. Sonicated mitochondrial membranes (2 mg ml⁻¹)
were fused with liposome (4 mg ml⁻¹) by incubating with 40 mM β-D-octyl glucoside
(β-OG, Sigma-Aldrich, 08001) at 4°C for 1 h in PIPES buffer containing
non-radioactive Leu and Glu. After removal of β-OG by Bio-Beads SM-2 (Bio-Rad),
mitochondrial liposomes were isolated on Sepharose 4B columns (Sigma-Aldrich,
4B-200) to remove the external substrates. Mitochondrial liposomes were trapped
on 10-kDa centrifugal filters (Millipore, UFC505024), eluted in 1200 µl PIPES
buffer without non-radioactive Leu or Glu, and then used for uptake assays.
Transport of [¹⁴C₆]Leu or [¹⁴C₅]Glu was initiated by incubating mitochondrial
liposomes with either 20 µM [¹⁴C₆]Leu or 20 µM [¹⁴C₅]Glu at 37°C and stopped
by filtering the reaction mixture with a vacuum manifold (0.45-µm pore size) at
the indicated time points. Following six washes with 600 µl ice-cold PIPES buffer,
uptake was quantified with a scintillation counter.

Proteoliposome assay. HEK293S cells stably expressing C-terminal Flag-tagged 
Slc25a44 were cultured in 9 l of suspension medium, collected and disrupted with a 
Dounce homogenizer in solubilization buffer (20 mM Tris-HCl, 100 mM NaCl, 
10% glycerol, 1% DDM with 0.1% cholesteryl hemisuccinate (CHS, Anatrace, 
D310-CH210), EDTA-free protease inhibitor (Roche)), followed by solubilization 
at 4°C for 1 h. After ultracentrifugation at 200,000g for 20 min, the supernatant 
was incubated with Flag M2 affinity gel (Sigma-Aldrich, A2220) at 4°C for 2 h. The 
immunoprecipitates were washed five times with washing buffer (20 mM Tris-HCl, 
500 mM NaCl, 10% glycerol, 0.1% DDM with 0.01% CHS), followed by competitive 
elution using Flag peptide (Sigma-Aldrich, F4799) in SEC buffer (20 mM Tris-HCl,
100 mM NaCl, 10% glycerol, 0.1% DDM with 0.01% CHS). Purified SLC25A44-Flag
(56 µg) was fused with liposome (8 mg) by incubating at 4°C for 1 h in 2 ml
PIPES buffer containing 25 mM non-radioactive Leu and Glu in the presence of
40 mM β-OG. Following removal of β-OG by Bio-Beads SM-2, proteoliposome was
isolated on Sepharose 4B columns to remove the external substrates. Subsequently,
the proteoliposomes were trapped on 10-kDa centrifugal filters, eluted in 1200 µl
PIPES buffer without non-radioactive Leu or Glu, and then used for uptake assays.
Transport of [¹⁴C₆]Leu was initiated by incubating proteoliposomes with 20 µM
[¹⁴C₆]Leu at 37°C and stopped by filtering the reaction mixture with a vacuum
manifold at the indicated time points. Following six washes with 600 µl ice-cold
PIPES buffer, uptake was quantified by a scintillation counter.

Temperature recording. For core-body temperature recording experiments, rectal 
temperature of BckdhaUCP1-KO and Slc25a44-KD mice was monitored using a
TH-5 thermometer (Physitemp) up to 14 h after cold exposure. For tissue temperature
recording, mice under anaesthesia were implanted with type T thermocouple
probes in the interscapular BAT, inguinal WAT, liver, and skeletal muscle, according
to the method that was described previously. Tissue temperature was recorded
by TC-2000 Meter (Sable Systems International). When tissue temperature was 
stable, mice were intraperitoneally administered noradrenaline at a dose of 1 mg 
per kg (body weight) to induce non-shivering thermogenesis. 
Electromyography. Skeletal muscle shivering was assessed by using electromyography
(EMG) recording, as reported in our previous study. In brief, mice were
placed in a restrainer to limit free movement, and 29-gauge needle electrodes were
placed in the back muscles of mice. The EMG signal was processed (low-pass filter,
3 kHz; high-pass filter, 10 Hz; notch filter, 60 Hz) and amplified 1,000× with Bio
Amp (ADInstruments). EMG data were collected from the implanted electrodes 
at a sampling rate of 2 kHz using LabChart 8 Pro Software (ADInstruments). The 
raw signal was converted to root mean square (RMS) activity. RMS activity was 
analysed for shivering bursts in 10-s windows. For monitoring muscle shivering 
in humans, EMG at the pectoral muscle was recorded by using a surface EMG 
(Polymate II; TEAC). EMG was recorded for 10 min at 27 °C before cold exposure, 
and for another 10 min at 19°C during the 2 h cold exposure. 
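
The conversion of a raw EMG trace to root mean square (RMS) activity in 10-s windows can be sketched as follows. This is a minimal illustration, assuming non-overlapping windows and dropping any incomplete trailing window; the text does not state whether windows overlapped, and the function name is hypothetical. At the stated 2 kHz sampling rate, each 10-s window holds 20,000 samples.

```python
import math


def windowed_rms(signal, sampling_rate_hz=2000, window_s=10):
    """Return one RMS value per non-overlapping window of the signal.

    Assumption: windows do not overlap and a trailing partial window
    is discarded.
    """
    window = int(sampling_rate_hz * window_s)
    if window <= 0:
        raise ValueError("window must contain at least one sample")
    rms_values = []
    for start in range(0, len(signal) - window + 1, window):
        chunk = signal[start:start + window]
        mean_square = sum(x * x for x in chunk) / window
        rms_values.append(math.sqrt(mean_square))
    return rms_values
```

Shivering bursts could then be scored on the resulting RMS series, for example against a baseline recorded before cold exposure; that scoring rule is not specified in the text and is left out of the sketch.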
¹⁸F-Fluciclovine-PET-CT scan. ¹⁸F-Fluciclovine (100 µCi) was administered to
male wild-type mice (C57BL/6J) at 14-15 weeks of age via a tail-vein injection
under 2% isoflurane anaesthesia after 6 h fasting. Mice were acclimatized to either
30°C or cold temperature at 15°C for 2 weeks. ¹⁸F-Fluciclovine uptake (SUV) was
measured every minute immediately after tail vein injection using a micro-PET-CT
imaging system at the UCSF PET-CT Imaging Core Facility. Changes in SUV
were quantified starting from the first 60 s after ¹⁸F-Fluciclovine injection by using
software AMIDE 1.0.4 (Amide).

Oxygen consumption assays. OCR in cultured adipocytes was measured using the 
Seahorse XFe Extracellular Flux Analyzer (Agilent) in a 24-well plate. For measure- 
ment of noradrenaline-induced respiration in the presence and absence of BCAA, 
differentiated adipocytes were maintained in KRB-HEPES buffer containing 
15 mM glucose, 200 nM adenosine, and 2% BSA. During OCR measurement, cells 
were treated with 2 mM BCAA (Val, Leu, or Ile), 2 mM KIV, 10 mM succinate, 
or vehicle, and subsequently treated with noradrenaline (1 µM) at the indicated
time point. For the mitochondrial stress test, differentiated brown adipocytes in
a 24-well plate were pretreated with 300 µM clofibrate, a BCAT2 activator, and
subjected to respiratory assay. During OCR measurement, cells were treated with
oligomycin (5 µM), carbonyl cyanide 4-(trifluoromethoxy) phenylhydrazone
(FCCP, 5 µM) and antimycin (5 µM).

Mitochondrial electron transport activity. Mitochondrial electron transport 
(ETC) activity was assessed as reported previously. In brief, mitochondria were
isolated from BAT of mice using a Comitial Kit (Ab110168, Abcam) and were
resuspended in 300 µl of isolation buffer provided by the kit. After protein
quantification by the BCA method, the mitochondrial suspension was diluted with
isolation buffer to a concentration of 0.1 mg ml⁻¹, seeded into a 24-well plate (5 µg per
50 µl per well), and adhered to the bottom of the plate by centrifugation at 2,000g at
4°C for 20 min using a microplate rotor adaptor. Immediately before the
measurement, 450 µl mitochondrial assay buffer and substrates supplemented with 10 mM
pyruvate, 5 mM malate, 50 mM KCl, 4 mM KH₂PO₄, 5 mM HEPES, 1 mM EGTA
and 4% fatty-acid-free BSA was added to each well. During OCR measurement by
the Seahorse XFe Extracellular Flux Analyzer, mitochondria were treated with 2 µM
rotenone, 10 mM succinate, 5 µM antimycin A and 100 µM N,N,N’,N’-tetramethyl-p-phenylenediamine
(TMPD) with 10 mM ascorbate at the indicated time points.
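
The seeding figures in this protocol follow from simple unit arithmetic: 1 mg ml⁻¹ equals 1 µg µl⁻¹, so 50 µl of a 0.1 mg ml⁻¹ suspension carries 5 µg of mitochondrial protein. The sketch below checks that and also shows how much isolation buffer would be needed to dilute a stock to the working concentration. The function names and the example stock concentration are hypothetical.

```python
def protein_per_well_ug(conc_mg_per_ml, volume_ul):
    """Protein mass in a well; mg/ml is numerically equal to µg/µl."""
    return conc_mg_per_ml * volume_ul


def buffer_to_add_ul(stock_mg_per_ml, stock_volume_ul, target_mg_per_ml):
    """Volume of buffer needed to dilute a stock to the target concentration."""
    if target_mg_per_ml <= 0 or target_mg_per_ml > stock_mg_per_ml:
        raise ValueError("target must be positive and not exceed the stock")
    final_volume_ul = stock_mg_per_ml * stock_volume_ul / target_mg_per_ml
    return final_volume_ul - stock_volume_ul


per_well = protein_per_well_ug(0.1, 50)      # µg seeded per well
extra = buffer_to_add_ul(2.0, 10, 0.1)       # hypothetical 2 mg/ml stock
```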
PDH activity and BCKDH activity assays. Tissue lysate was prepared by homoge- 
nizing BAT in ice-cold PBS buffer containing cOmplete Protease Inhibitor Cocktail 
(Roche) and 5 mM NaF. Two hundred micrograms of BAT lysates were applied 
to measure PDH enzymatic activities by a commercially available kit (Abcam, 
ab109902). The BCKDH activity measurement was performed as previously
described.

Glucose oxidation assay. Differentiated adipocytes in a six-well plate were
incubated in DMEM containing 2% FBS for 2 h. After washing in PBS, cells were
incubated in 1 ml of KRB-HEPES buffer containing 2% BSA, 15 mM glucose, 200 nM
adenosine, and 0.5 µCi ml⁻¹ [1-¹⁴C]glucose, supplemented with or without
1 mM Val, at 37°C for 2 h. Subsequently, 350 µl 30% hydrogen peroxide was added
in each well, and [¹⁴C]CO₂ was trapped in the smears supplemented with 300 µl
1 M benzethonium hydroxide solution at room temperature for 20 min. For the
assay in tissues, mice were fasted for 6 h and euthanized. Isolated tissue (20-30 mg)
was placed in a polypropylene round-bottom tube and incubated in the 1 ml KRB-HEPES
buffer containing 1.0 µCi ml⁻¹ [1-¹⁴C]glucose at 37°C for 1 h. After adding
350 µl 30% hydrogen peroxide in the tube, [¹⁴C]CO₂ was trapped in the centre well
supplemented with 300 µl of 1 M benzethonium hydroxide solution for 20 min at
room temperature. Glucose oxidation was quantified by counting radioactivity of
trapped [¹⁴C]CO₂ using a scintillation counter.

Fatty acid oxidation assay. Differentiated adipocytes were plated in a six-well plate 
and incubated in medium containing 2% FBS for 4 h. After washing in PBS, the
cells were incubated in 1 ml of KRB-HEPES buffer, containing 15 mM glucose,
0.1 mM oleic acid, and 0.5 µCi ml⁻¹ [1-¹⁴C]oleic acid bound to 2% BSA and 100 µM
carnitine, supplemented with or without 1 mM Val, for 2 h at 37°C. Then, 350 µl of
30% hydrogen peroxide was added to each well to trap [¹⁴C]CO₂ in the smears
supplemented with 300 µl of 1 M benzethonium hydroxide solution. For the assay
in tissues, mice were fasted for 4 h and euthanized. Isolated tissue (20–30 mg) was
placed in a polypropylene round-bottom tube and incubated in 1 ml KRB-HEPES
buffer containing 1.0 µCi ml⁻¹ [1-¹⁴C]oleic acid at 37°C for 1 h. After
adding 350 µl of 30% hydrogen peroxide to the tube, [¹⁴C]CO₂ was trapped in the
centre well supplemented with 300 µl of 1 M benzethonium hydroxide solution
for 20 min at room temperature. Oleic acid oxidation was quantified by counting the
radioactivity of trapped [¹⁴C]CO₂ using a scintillation counter.
Immunoblotting. Protein lysates from isolated tissues or cultured cells were 
extracted using Qiagen TissueLyzer LT and RIPA lysis and extraction buffer 
(Thermo Fisher) and cOmplete protease inhibitors (Roche). Tissue lysates were 
applied to immunoblot analysis using the UCP1 antibody (1:2,000), BCAT1 anti- 
body (1:1,000), BCAT2 antibody (1:1,000), BCKDHA antibody (1:2,000), TOM20 
antibody (1:2,000), COX-IV antibody (1:2,000), OXPHOS cocktail (1:2,000),
PDH-E1α antibody (1:1,000), PDH-E1α (pSer232) antibody (1:1,000), PDH-E1α
(pSer293) antibody (1:1,000), PDH-E1α (pSer300) antibody (1:1,000), and
SLC25A44 antibody (1:1,000). β-actin (1:10,000) and GAPDH (1:2,000) were used
as loading controls for each sample.

Tissue histology and immunostaining. For H&E staining, tissues of mice were 
fixed in 4% paraformaldehyde overnight at 4°C, followed by dehydration in 70% 
ethanol. After the dehydration procedure, tissues were embedded in paraffin, 
sectioned at a thickness of 5 µm, and stained with H&E following the standard
protocol. For immunostaining, paraffin-embedded tissues were deparaffinized 
twice in xylene and subsequently rehydrated. After incubating the slides for 
20 min in boiling water, the tissues were blocked in PBS containing 2% BSA for 
60 min. After washing in PBS, the slides were incubated with the primary antibody 
(chicken anti-mouse GFP, 1:200) overnight at 4°C, followed by incubation with the 
fluorescence-conjugated secondary antibody (goat anti-chicken IgG Alexa Fluor 488
green, 1:500) for 1 h at room temperature. After washing, the sections were stained 
with DAPI and mounted with mounting medium (Cytoseal 60, Thermo-Scientific). 
Images of tissue samples were captured using the Inverted Microscope Leica DMi8. 
Statistical analyses. All data were expressed as mean ± s.e.m. and analysed with
statistical software (SPSS 25.0; IBM). The sample size was determined by
power analysis with α = 0.05 and power of 0.8, and based on our experience
with experimental models, anticipated biological variables and previous studies.
The metabolite analyses in human sera and mouse plasma, the [¹³C₆, ¹⁵N₁]Leu
tracing in human brown adipocytes, the PET-CT examination using ¹⁸F-FDG
(in humans) or ¹⁸F-fluciclovine (in mice), and GTT and ITT in mice fed a high-fat
diet were performed by researchers who were blinded to the experimental
groups. RNA sequencing and library constructions were performed by technical
staff at the UCLA genome core who were blinded to the experimental groups.
RNA sequencing alignment was performed by researchers who were blinded to
the experimental groups. Blinding was not relevant to the other experiments in
mice or cells because mice or cells had to be genotyped by PCR. Comparisons
between two groups were analysed using the paired t-test or the Student's t-test,
as appropriate. One-way or two-way ANOVA followed by Tukey’s post hoc test
or post hoc paired/unpaired t-tests with Bonferroni’s correction was used for
multiple group comparisons. One-way or two-way repeated measures ANOVA was
used for the comparisons of repeated measurements. Pearson's and Spearman's
correlation coefficients were used to assess correlations for normally distributed
variables and non-normally distributed variables, respectively. A one-tailed paired
t-test was used to analyse the qRT-PCR validation of human BAT biopsy RNA-seq data.
For all other experiments, two-tailed P values were calculated; P < 0.05 was considered
statistically significant.
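
As an illustration of two quantities named in the statistical analyses (mean ± s.e.m. and Bonferroni's correction), a minimal Python sketch follows. It is not the authors' SPSS workflow; the function names `mean_sem` and `bonferroni` are hypothetical, and the use of the sample standard deviation (n − 1 denominator) for the s.e.m. is an assumption.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, s.e.m.), with s.e.m. = sample SD / sqrt(n).

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed to estimate the s.e.m.")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


def bonferroni(p_values, alpha=0.05):
    """Bonferroni adjustment: multiply each P value by the number of
    comparisons (capped at 1) and compare the result with alpha."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    return [(p_adj, p_adj < alpha) for p_adj in adjusted]
```

With three post hoc comparisons, for example, an unadjusted P value must fall below 0.05/3 to remain significant after this correction.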

Reporting summary. Further information on research design is available in the 
Nature Research Reporting Summary linked to this paper. 
Extended Data Fig. 1 | Cold-induced changes in circulating metabolites
in mice and humans. a, Representative EMG in adult humans at 27°C
and following cold exposure at 19°C for 2 h. Voluntary muscle contraction
as a positive control of EMG recording. b, c, Serum non-esterified fatty
acids (NEFA) (b) and blood glucose (c) levels in high- (n = 9) and low-BAT
subjects (n = 6) at thermoneutral 27°C (TN) and following cold
exposure at 19°C. d, Correlation between BAT activity (SUV, log₁₀) and
cold-induced changes in serum amino acid levels of high- (red dots) and
low-BAT subjects (blue dots). n = 33 per group (all amino acids) except
n = 29 (Asp). e, Correlation between fat-free mass (kg) and changes in
serum total BCAAs in d. n = 33. f, Changes in plasma BCAA levels at
thermoneutral (30°C) or cold exposure (15°C) in diet-induced obese
mice. n = 8 (TN), n = 7 (cold). b–f, Biologically independent samples.
Data are mean ± s.e.m.; two-sided P values by paired t-test (b, c) or
two-way repeated-measures ANOVA followed by post hoc paired or unpaired
t-tests with Bonferroni’s correction (f). Pearson's (r) or Spearman's rank
correlation coefficient (rₛ) was calculated, as appropriate (d, e).


Extended Data Fig. 2 | The BCAA catabolic pathway in human
and mouse adipose tissues. a, ¹⁸F-Fluciclovine uptake into indicated
organs determined by dynamic PET scanning. n = 5 per group. b, Val
oxidation (per mg tissue) in indicated tissues of mice acclimatized to
23°C or 12°C for one week. n = 5 per group. c, Total Val oxidation in
b. Total Val oxidation was calculated by multiplying Val oxidation per
mg tissue (cpm per mg tissue) by the tissue mass of the depot (mg). d, Val
oxidation normalized to total protein (µg) in human brown adipocytes
and white adipocytes following 2-h treatment with noradrenaline or
vehicle. n = 5 (Veh), n = 6 (noradrenaline). e, Expression profile of
BCAA catabolic enzymes enriched in brown and beige fat relative
to white fat of humans (left) and mouse (middle, right). Data were
obtained from a previous RNA-seq dataset in humans’ and a microarray
dataset in mice'”. The profiles were mapped onto the KEGG BCAA
catabolic pathway. The number of brown and beige-enriched enzymes
among total BCAA catabolic enzymes is shown. n = 3 per group.
f, Proteomic profile of indicated enzymes in the BCAA oxidation
pathway and mitochondrial carriers (SLC25A families) in interscapular
BAT of mice at thermoneutrality (29°C) or 5°C for 3 weeks!®. n = 4
per group. g, Transcriptional profile of indicated genes in the glucose
oxidation pathway (left) and the BCAA oxidation pathway (right) in the
supraclavicular BAT and abdominal WAT from the identical subject under
a thermoneutral condition (27°C) and after cold exposure at 19°C (ref. °).
The colour scale represents Z-scored FPKM (fragments per kilobase of
exon per million fragments mapped). h, mRNA expression level (FPKM)
of Bcat1 and Bcat2 in differentiated brown adipocytes, beige adipocytes
and white adipocytes. The transcriptome data are from a previous
RNA-seq dataset’>. i, Immunoblotting of BCAT1 and BCAT2 in indicated
tissues of mice kept at ambient temperature. GAPDH as a loading control.
Representative result from two independent experiments. Gel source data
are in Supplementary Fig. 1. a–h, Biologically independent samples. Data
are mean ± s.e.m.; two-sided P values by unpaired Student's t-test
(b, c, f), two-way repeated measures ANOVA (a), or two-way factorial
ANOVA followed by Tukey’s post hoc test (d).
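
The two derived quantities in this legend, total Val oxidation per depot (panel c) and Z-scored expression values (panel g), can be sketched in Python as below. This is only an illustration under stated assumptions: the function names are hypothetical, and the legend does not say whether the Z-score uses the sample or the population standard deviation, so the sample standard deviation is assumed here.

```python
import statistics


def total_depot_oxidation(cpm_per_mg, depot_mass_mg):
    """Total oxidation in a depot: oxidation per mg tissue (cpm per mg)
    multiplied by the tissue mass of the depot (mg)."""
    return cpm_per_mg * depot_mass_mg


def z_scores(values):
    """Z-score each value against the mean and SD of the same list.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    mu = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        raise ValueError("Z-scores are undefined when all values are equal")
    return [(v - mu) / sd for v in values]
```

Because the depot total scales with tissue mass, a large depot with low per-mg oxidation can still contribute substantially to whole-body oxidation, which is why panels b and c are reported separately.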
Extended Data Fig. 3 | Characterization of BAT-specific Bckdha-KO
mice. a, mRNA expression of Bckdha in BAT of Bckdhaᵁᶜᴾ¹-KO and
littermate control mice. n = 5 per group for all groups except n = 3 for
control-gastrocnemius. b, Val oxidation normalized to tissue weight
(in mg) in indicated tissues of mice in a. n = 4 per group. c, Enzymatic
activity of BCKDH complex (KIV oxidation) in BAT of control and
Bckdhaᵁᶜᴾ¹-KO mice acclimatized to 23°C (n = 3 per group) or 12°C
(control n = 5, KO n = 6) for one week. d, Tissue weights of mice in
a on a normal chow at ambient temperature. n = 4 per group. e, mRNA
expression of indicated genes in BAT of mice in a. n = 5 per group.
f, EMG of muscle shivering in control (n = 7) and Bckdhaᵁᶜᴾ¹-KO mice
(n = 9) at 30°C or 8°C. The right graph shows quantitative root mean
square (RMS) of EMG. g, Liver temperature of control and Bckdhaᵁᶜᴾ¹-KO
mice following noradrenaline treatment. n = 4 per group. h, Plasma
amino acid levels after 3 h BCAA oral gavage. n = 5 per group. i, Plasma
BCAA concentration of control (n = 7) and Bckdhaᵁᶜᴾ¹-KO mice (n = 9)
following cold exposure at 8°C. a–i, Biologically independent samples.
Data are mean ± s.e.m.; two-sided P values by unpaired Student's t-test
(a, b, d, e, h), two-way factorial ANOVA followed by Tukey’s post hoc test
(c), or two-way repeated measures ANOVA (f, g, i) followed by post hoc
paired or unpaired t-tests with Bonferroni's correction (f, i).
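
The root mean square (RMS) used to quantify the EMG signal in panel f is the square root of the mean of the squared samples. A minimal Python sketch follows; the function name `rms` is hypothetical, and any filtering or windowing applied by the recording software is not described in this legend and is not modelled here.

```python
import math


def rms(samples):
    """Root mean square of a sequence of EMG amplitude samples."""
    if not samples:
        raise ValueError("RMS is undefined for an empty signal")
    return math.sqrt(sum(x * x for x in samples) / len(samples))
```

Squaring removes the sign of each sample, so a signal that oscillates around zero still yields an RMS that reflects its amplitude.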
Extended Data Fig. 4 | The effect of noradrenaline on BCAA
metabolism in brown adipocytes. a, Scheme of the metabolic tracer
experiment in human brown adipocytes. Cells were treated with vehicle or
noradrenaline for 1 h in the presence of [¹³C₆, ¹⁵N₁]Leu. b, Isotopologue
distributions of TCA intermediates from [¹³C₆, ¹⁵N₁]Leu in a. n = 6 per
group. c, Protein expression of indicated BCAA catabolic enzymes at
indicated time points of cold acclimatization. The expression profile is
analysed in the proteomics dataset'*®. n = 4 (TN, cold 3 weeks), n = 3 (cold
8 h, 1 day, 3 days, 1 week). d, The BCAA catabolic pathway that indicates
Val and Leu catabolic enzymes. Enzymes whose protein expression was
transiently upregulated by acute cold exposure are highlighted in red
on the basis of the results in c. Enzymes whose protein expression was
gradually upregulated following chronic cold adaptation are highlighted
in blue. e, OCR normalized to total protein (in µg) in human brown
adipocytes. Differentiated adipocytes in the BCAA-free medium were
supplemented with Val or vehicle, and subsequently stimulated with
noradrenaline. n = 10 per group. f, Schematics of the mitochondrial Val
catabolic pathway. Vanadate and malonate inhibit succinyl coenzyme A
synthetase and succinate dehydrogenase, respectively. g, Noradrenaline-induced
OCR in the presence and absence of Val in mouse brown
adipocytes. Following pretreatment with vanadate (50 µM) or malonate
(5 mM), differentiated cells in the BCAA-free medium were supplemented
with Val or vehicle, and subsequently treated with noradrenaline. n = 9
(vehicle), n = 8 (Val), n = 4 (vehicle + vanadate, Val + vanadate), n = 5
(vehicle + malonate, Val + malonate). h, Noradrenaline-induced OCR in
the presence and absence of BCAAs in mouse brown and white adipocytes.
Differentiated cells were supplemented with indicated amino acids,
and subsequently treated with 1 µM noradrenaline. Brown adipocytes:
n = 10 (Val−, Val+, Ile+), 9 (Leu−), 5 (Leu+) and 11 (Ile−). White
adipocytes: n = 9 (Val−) and 10 (Val+). i, Noradrenaline-induced OCR
in the presence and absence of Val in wild-type, Ucp1-KO and Bckdha-KO
brown adipocytes. Bckdha-KO brown adipocytes were treated with 2 mM
KIV, 10 mM succinate or vehicle before noradrenaline stimulation. Wild
type: n = 10 (Val−) and 9 (Val+). Ucp1-KO: n = 10 (Val−, Val+). Bckdha
KO: n = 7 (Val+), 9 (Val+; KIV+) and 10 (Val+; succinate+). j, OCR
normalized to total protein (µg) in wild-type (left) and Bckdha-KO
brown adipocytes (right). Differentiated adipocytes were pretreated with
BCAT2 activator, clofibrate (300 µM), or vehicle. Following measurement
of basal OCR, cells were treated with oligomycin (5 µM), FCCP (5 µM),
and antimycin A (AA, 5 µM). Wild type: n = 5 per group. Bckdha KO:
n = 7 per group. b, c, e, g–j, Biologically independent samples. Data are
mean ± s.e.m.; two-sided P values by unpaired Student's t-test (b, g, h),
one-way factorial ANOVA followed by Tukey’s post hoc test (i) or two-way
repeated measures ANOVA (e, j).



Extended Data Fig. 5 | Metabolic characterization of Bckdhaᵁᶜᴾ¹-KO
mice. a, Cumulative food intake of Bckdhaᵁᶜᴾ¹-KO mice (n = 15) and
littermate controls (n = 13) on high-fat diet. b, Fat mass and lean mass
of mice in a at 10 weeks of high-fat diet. c, Tissue weights of mice in
a. d, Triglyceride (TG) content in the liver of mice in a. n = 8 per group.
e, Oleic acid oxidation normalized to tissue mass (mg) in the interscapular
BAT of mice acclimatized to thermoneutral 30°C or cold exposure at
12°C. n = 4 per group. f, PDH activity in the inguinal WAT, gastrocnemius
muscle and liver of Bckdhaᵁᶜᴾ¹-KO mice and littermate controls that
were exposed to cold at 12°C for 1 week. Inguinal WAT (Ing-WAT):
n = 5 (control) and 6 (Bckdhaᵁᶜᴾ¹-KO). Gastrocnemius, liver: n = 4 per
group. g, Immunoblotting for PDH-E1α(pSer232), PDH-E1α(pSer293),
PDH-E1α(pSer300), and total PDH-E1α in the BAT of the control and
Bckdhaᵁᶜᴾ¹-KO mice. GAPDH as a loading control. n = 4 per group.
Uncropped immunoblot images are available in Supplementary Fig. 1.
h, Quantification of phosphorylated PDH-E1α normalized to total PDH-E1α
protein level in g. a–h, Biologically independent samples. Data are
mean ± s.e.m.; two-sided P values by unpaired Student's t-test (b–d, f, h),
two-way repeated measures ANOVA (a) or two-way factorial ANOVA
followed by Tukey’s post hoc test (e).
Extended Data Fig. 6 | Characterization of SLC25A44 in thermogenic
adipocytes. a, Expression profile of Slc25a family members in the inguinal
WAT of mice acclimatized to 23°C or 12°C for 1 week. n = 3 per group.
b, mRNA expression of UCP1, SLC25A44 and SLC25A39 normalized to
TBP levels in the supraclavicular BAT from the same individuals (six pairs)
at thermoneutrality (27°C) and cold temperature (19°C). c, Mitochondrial
localization of SLC25A44 protein in differentiated mouse beige adipocytes.
TOM20 was used as a mitochondrial marker. d, Immunoblotting for
SLC25A44 in BAT and liver of control and Slc25a44-KD mice. GAPDH
was used as a loading control. Red arrows indicate specific bands whose
intensities were decreased in Slc25a44-KD mice. e, mRNA expression of
Slc25a44 and indicated genes normalized to levels of 36B4 (also known
as Rplp0) during mouse brown adipogenesis. n = 4 per group. f, Protein
expression of SLC25A44 in mouse beige preadipocytes and differentiated
adipocytes. β-actin was used as a loading control. g, Protein expression
of UCP1 and SLC25A44 in immortalized human brown preadipocytes
and differentiated adipocytes. β-actin was used as a loading control.
a, b, e, Biologically independent samples. Data are mean ± s.e.m.;
one-sided P values by paired t-test (b) and two-sided P values by unpaired
Student’s t-test (a). c, d, f, g, Representative results from two independent
experiments. Uncropped images are available in Supplementary Fig. 1.
Extended Data Fig. 7 | Biochemical characterization of SLC25A44.
a, Genomic Slc25a44 sequence of Slc25a44-KO brown cell line (upper
panel). Predicted amino acid sequence of SLC25A44 is shown in lower
panel. Homozygous mutation in the Slc25a44 gene by CRISPR-Cas9
results in a premature stop codon in KO cells. b, Scheme of mitochondrial
BCAA uptake assay. Isolated mitochondria from differentiated brown
adipocytes were incubated with [U-¹⁴C₅]Val. Mitochondrial uptake was
quantified by a scintillation counter. c, Validation of mitochondrial Val
uptake assay in differentiated brown adipocytes. Note that addition of
excess non-labelled Val (20 mM) abolished [U-¹⁴C₅]Val uptake into
the mitochondria. d, mRNA expression of Slc25a44 and Slc25a39 in
differentiated mouse brown adipocytes expressing a scrambled control
shRNA (Scr, n = 6) and shRNAs targeting Slc25a44 (shRNA #1, #2, n =
4 per group), Slc25a39 (n = 4) or both Slc25a44 shRNA #1 and Slc25a39
shRNA (double knockdown, n = 5). e, Mitochondrial uptake of [U-¹⁴C₅]Val
(left) and [U-¹⁴C₆]Leu (right) in brown adipocytes in d. n = 3 per
group. f, mRNA and protein expression of Slc25a44 in mitochondria of
Neuro2a cells expressing an empty vector or Slc25a44. COX-IV was used
as a loading control. n = 3 per group. g, Immunoblotting for SLC25A44
in the isolated mitochondria from differentiated Slc25a44-KO brown
adipocytes expressing an empty vector or Slc25a44. TOM20 was used as
a loading control. h, Immunoblotting of SLC25A44 in the
mitochondria-fused liposome. Mitochondrial membrane isolated from Slc25a44-KO
brown adipocytes expressing an empty vector or Slc25a44 was fused
with liposome. TOM20 was used as a loading control. i, [U-¹⁴C₆]Leu
uptake rate in the liposome in h. n = 3 per group. j, [U-¹⁴C₅]Glu uptake
rate in the liposome in h. n = 3 per group. k, Coomassie blue staining of
purified SLC25A44 protein from HEK293S cells overexpressing Slc25a44.
l, Immunoblotting of SLC25A44 in liposomes reconstituted with purified
SLC25A44 (proteoliposome) and liposomes reconstituted without
SLC25A44 (empty liposome). m, Left, [U-¹⁴C₆]Leu transport
into proteoliposomes in l. Right, Leu uptake rate. n = 3 per group.
d–f, Biologically independent samples. i, j, m, Technically independent
samples. f–m, Representative result from two independent experiments.
Data are mean ± s.e.m.; two-sided P values by unpaired Student's t-test
(f, i, j, m) or one-way ANOVA followed by Tukey’s post hoc test
(d, e). f–h, k, l, Uncropped images are available in Supplementary Fig. 1.
Extended Data Fig. 8 | Generation of Slc25a44ᴮᴬᵀ-KD mice. a, DNA
constructs used in the generation of dCas9-KRAB mice. The dCas9-KRAB
construct was inserted into the Hipp11 (H11) gene locus by the
site-specific PhiC31 integrase. b, Experimental procedure of gRNA screening.
MEFs from dCas9-KRAB mice were used to identify gRNAs that effectively
deplete Slc25a44. Graph shows Slc25a44-knockdown efficiency for six
independent gRNAs in the dCas9-KRAB-derived MEFs (n = 2 per group).
gRNA-Slc25a44 #1 (indicated by a red arrow) was used for generation
of gRNA Tg mouse. c, Schematics of BAT-specific Slc25a44-KD mice
Extended Data Fig. 9 | Characterization of Slc25a44-KD mice.
a, Generation of Slc25a44-KD mice by the dCas9-KRAB system. The
dCas9-KRAB mouse was crossed with a transgenic mouse expressing gRNA
targeting Slc25a44 to generate SLC25A44-deficient mice. b, Slc25a44
mRNA expression normalized to 36B4 levels and protein expression
in the BAT of mice in a. β-actin was used as a loading control. n = 5
(control) and 4 (Slc25a44-KD). c, Expression profile of Slc25a family
members in BAT in a by RNA-seq analysis. The colour scale shows
log₂-transformed fold change in TPM (Slc25a44-KD versus control). n = 3 per
group. d, mRNA expression of Slc25a families normalized to 36B4 levels
in Slc25a44-KO and control brown adipocytes. n = 6 per group. e, H&E
staining of BAT, inguinal WAT, liver and gastrocnemius muscle from mice
in a. f, Triglyceride content in the interscapular BAT of Slc25a44-KD and
control mice. n = 4 per group. g, Expression profile of fatty acid synthesis-
and oxidation-related genes in BAT of mice in a by RNA-seq analysis.
The colour scale shows log₂-transformed fold change in TPM (Slc25a44-KD
versus control). n = 3 per group. h, Oleic acid oxidation normalized
to tissue mass (mg) in BAT of Slc25a44-KD and control mice acclimatized
to thermoneutral 30°C or cold temperature (12°C) for one week. n = 4
per group. i, EMG measurement of muscle shivering in control mice and
Slc25a44-KD mice at 30°C or 8°C. The lower graph shows the RMS of the
EMG. n = 6 per group. j, Tissue temperature in indicated tissues of control
and Slc25a44-KD mice following noradrenaline treatment (indicated by
red arrows). n = 4 per group. b–d, f–j, Biologically independent samples.
Data are mean ± s.e.m.; two-sided P values by unpaired Student's t-test
(b–d, f, g), two-way factorial ANOVA followed by Tukey’s post hoc
test (h), or two-way repeated measures ANOVA (i, j) followed by post
hoc paired or unpaired t-test with Bonferroni’s correction (i). b, e,
Representative results from two independent experiments. Uncropped
immunoblot images are available in Supplementary Fig. 1.


Extended Data Fig. 10 | The cell-autonomous role of SLC25A44 in
brown adipocytes. a, Immunoblotting of SLC25A44 in human brown
adipocytes expressing a scrambled control shRNA (Scr) and shRNAs
targeting SLC25A44 (#1, #2). β-actin as a loading control. b, mRNA
expression of SLC25A44 normalized to TBP levels in a. n = 3 per group.
c, Noradrenaline-induced OCR normalized to total protein (µg) in the
presence and absence of Val supplementation in a. Differentiated human
brown adipocytes in the BCAA-free medium were supplemented with Val
or vehicle, and subsequently treated with noradrenaline. n = 9 per group
(Scr control, sh-Slc25a44 #1), n = 10 per group (sh-Slc25a44 #2). d, Mean
noradrenaline-induced OCR in c. e, Illustration of Val metabolism in the
mitochondria. f, Immunoblotting of mitochondrial proteins (as indicated)
in the interscapular BAT of control and Slc25a44-KD mice. GAPDH
as a loading control. g, ETC activity of BAT mitochondria. Isolated
mitochondria from BAT of control mice and Slc25a44-KD mice were
treated with rotenone (2 µM), succinate (10 mM), antimycin A (5 µM) and
TMPD (100 µM) with ascorbate (Asc, 10 mM). n = 5 per group. h, mRNA
expression of Slc25a44 normalized to 36B4 in mouse beige adipocytes
expressing an empty vector (n = 3) or Slc25a44 (n = 4). i, Mitochondrial
Val uptake in beige adipocytes in h. n = 3 per group. j, Noradrenaline-induced
OCR in h. Differentiated adipocytes in the BCAA-free medium
were supplemented with Val or vehicle and subsequently stimulated
with noradrenaline. Vector: n = 20 (vehicle) and 16 (Val). Slc25a44:
n = 13 (vehicle) and 16 (Val). k, Immunoblotting of SLC25A44 in C2C12
myotubes expressing an empty vector or Slc25a44. β-actin as a loading
control. l, Val oxidation normalized to total protein (µg) in C2C12
myotubes in k. n = 6 per group. m, OCR normalized to total protein
(µg) in C2C12 myotubes in k. n = 9 per group. b–d, g–j, l, m, Biologically
independent samples. Data are mean ± s.e.m.; two-sided P values
by unpaired Student's t-test (h, i, l, m), one-way (b) or two-way
(d, j) factorial ANOVA followed by Tukey’s post hoc test, or two-way
repeated measures ANOVA (c, g). a, f, k, Representative results from two
independent experiments. Uncropped immunoblot images are available in
Supplementary Fig. 1.
Reporting Summary


Nature Research wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency 
in reporting. For further information on Nature Research policies, see Authors & Referees and the Editorial Policy Checklist. 


Statistical parameters 


When statistical analyses are reported, confirm that the following items are present in the relevant location (e.g. figure legend, table legend, main 
text, or Methods section). 


n/a | Confirmed 


The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement 


An indication of whether measurements were taken from distinct samples or whether the same sample was measured repeatedly 


The statistical test(s) used AND whether they are one- or two-sided 
Only common tests should be described solely by name; describe more complex techniques in the Methods section. 


A description of all covariates tested 


A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons 


A full description of the statistics including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND 
variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) 


For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted 
Give P values as exact values whenever suitable. 


For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings 


For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes 


Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated 


Clearly defined error bars 
State explicitly what error bars represent (e.g. SD, SE, Cl) 


Our web collection on statistics for biologists may be useful. 


Software and code 


Policy information about availability of computer code 


Data collection PET-CT image: VOX-BASE workstation (for humans), Amide 1.0.4 (for mice) 
RNAseq analysis: Kallisto 0.44.0; TopHat version 2.0.8; Cuffdiff 2.1.1; Metascape; Ingenuity Pathway Analysis; tximport 1.10.0. 
QPCR: QuantStudio Real-time PCR system 1.2v 
Seahorse: Wave 2.4 
EMG: LabChart 8 


Data analysis Microsoft Office Excel 2016; SPSS Statistics version 25.0; R package version 3.6.1


For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers 
upon request. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information. 




Data 


Policy information about availability of data 
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: 


- Accession codes, unique identifiers, or web links for publicly available datasets 
- A list of figures that have associated raw data 
- A description of any restrictions on data availability 


RNA-seq dataset generated in this study (Extended Data Fig.9c, 9g) is available at Array Express under the accession code E-MTAB-7987 (For reviewers accession 
code: Username: Reviewer_E-MTAB-7987, Password: ww5qvwph). Publicly available array datasets used in this study are GSE51080 (Extended Data Figure 2e), 
ArrayExpress E-MTAB-2602 (Extended Data Figure 2e), ArrayExpress MTAB-4031 (Extended Data Figure 2g), ArrayExpress E-MTAB-2624 (Extended Data Figure 2h). 
RNA-seq analysis in Extended Data Figure 6a is available in Source Data Extended Data Figure 6. Publicly available proteomics analysis dataset (Sustarsic EG et al. Cell 
Metab 28: 159-174, 2018 doi: 10.1016/j.cmet.2018.05.003) was used in Extended Data Figures 2f, 4c. 13C-labeled Leu metabolic tracing data is available in 
Supplementary Table 3. 


Field-specific reporting 


Please select the best fit for your research. If you are not sure, read the appropriate sections before making your selection. 


Life sciences Behavioural & social sciences Ecological, evolutionary & environmental sciences 


For a reference copy of the document with all sections, see nature.com/authors/policies/ReportingSummary-flat.pdf 


Life sciences study design 


All studies must disclose on these points even when the disclosure is negative. 


Sample size The sample size was determined by power analysis with α = 0.05 and a power of 0.8, as developed by Cohen (1988), and based on our 
experience with experimental models, anticipated biological variables, and previous literature. Sample numbers are described in the Figure 
legends. 
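
A power analysis of the kind cited above can be sketched as follows. This is only an illustration under stated assumptions: it uses the normal approximation for a two-sided, two-sample comparison of means, and the effect sizes passed in are hypothetical, since the form does not report the effect size the authors assumed.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2,
    where d is Cohen's standardized effect size (a hypothetical input here).
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    for d in (0.5, 0.8, 1.0):  # illustrative effect sizes only
        print(d, sample_size_per_group(d))
```

An exact calculation based on the t distribution gives slightly larger group sizes than this approximation, especially for small samples.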


Data exclusions No data were excluded in the study. 


Replication All the biological experiments were repeated at least twice and reproduced. RNA-sequencing and metabolomics were performed once, but 
three independent samples were analyzed and further validated by alternative approaches, such as qRT-PCR. Western blotting data were 
confirmed with two or three independent samples. 


Randomization Mice were randomly assigned at the time of purchase or weaning to minimize any potential bias. This is described in the Method (animals). 


Blinding The metabolite analyses in human sera and mouse plasma, the [13C6, 15N1] leucine tracing in human brown adipocytes, the PET/CT 
examination using 18F-FDG (in humans) or 18F-Fluciclovine (in mice), and GTT/ITT in mice fed high fat diet were performed by the authors 
who were blinded to the experimental groups. RNA sequencing and library construction were performed by technical staff at the UCLA 
genome core who were blinded to the experimental groups. RNA sequencing alignment was performed by the authors who were blinded to 
the experimental groups. Blinding was not relevant to the other experiments in mice or cells because mice or cells had to be genotyped by 
PCR. 


Reporting for specific materials, systems and methods 


Materials & experimental systems      Methods 
n/a | Involved in the study           n/a | Involved in the study 
Unique biological materials           ChIP-seq 
Antibodies                            Flow cytometry 
Eukaryotic cell lines                 MRI-based neuroimaging 
Palaeontology 
Animals and other organisms 
Human research participants 
Unique biological materials 


Policy information about availability of materials 


Obtaining unique materials All unique materials used are available from the authors upon reasonable request. 


Antibodies 


Antibodies used The following antibodies were used in this study: 

- Anti-BCAT1 antibody, mouse monoclonal, 1:1000 (WB), OriGene (TA504360) 
- Anti-BCAT2 antibody, rabbit polyclonal, 1:1000 (WB), Cell Signaling Tech (9432) 
- Anti-BCKDHA antibody, mouse monoclonal, 1:2000 (WB), Santa Cruz (sc-271538) 
- Anti-COX-IV antibody, rabbit monoclonal, 1:2000 (WB), Cell Signaling Tech (4850) 
- Anti-Flag antibody, mouse monoclonal, 1:2000 (WB), Sigma (A8592) 
- Anti-GAPDH antibody, mouse monoclonal, 1:2000 (WB), Santa Cruz (sc-32233) 
- Anti-GFP antibody, chicken polyclonal, 1:200 (staining), Aves labs (GFP-1020) 
- Anti-chicken IgG, Alexa Fluor 488, goat polyclonal, 1:500 (staining), Life Technologies (A11039) 
- Anti-OXPHOS antibody cocktail, mouse monoclonal, 1:2000 (WB), Abcam (ab110413) 
- Anti-PDH-E1α antibody, mouse monoclonal, 1:1000 (WB), Santa Cruz (sc-377092) 
- Anti-PDH-E1α (pSer232) antibody, rabbit polyclonal, 1:1000 (WB), Millipore (AP1063) 
- Anti-PDH-E1α (pSer293) antibody, rabbit monoclonal, 1:1000 (WB), Abcam (ab177461) 
- Anti-PDH-E1α (pSer300) antibody, rabbit polyclonal, 1:1000 (WB), Millipore (AP1064) 
- Anti-SLC25A44 antibody, rabbit polyclonal, 1:1000 (WB), GenScript (custom order; generated using the peptides MEDKRNIQIEWEHLDKKKC, MMQRKGEKMGRFQVC and CKKLSLRPELVDSRH as epitopes for immunization in rabbit) 
- Anti-TOM20 antibody, rabbit polyclonal, 1:2000 (WB), Proteintech (11802-1-AP) 
- Anti-UCP1 antibody, rabbit polyclonal, 1:2000 (WB), Abcam (ab10983) 
- Anti-β-actin antibody, mouse monoclonal, 1:10000 (WB), Sigma (A3854) 


Validation All antibodies were validated for the application and species used in this study by their manufacturers and by the authors. 
Antibodies were validated based on the size of the band in western blotting (molecular weight), specificity/selectivity assessed by 
using samples from knockout mice, knockdown mice, knockdown cells or overexpression cells, and reproducibility of the 
results. 

- Anti-BCAT1 antibody: https://www.origene.com/catalog/antibodies/primary-antibodies/ta504360/bcat1-mouse-monoclonal-antibody-clone-id-oti3f5 
- Anti-BCAT2 antibody: https://www.cellsignal.com/products/primary-antibodies/bcat2-antibody/9432 
- Anti-BCKDHA antibody: https://www.scbt.com/scbt/product/bckde1a-antibody-h-5?productCanUrl=bckde1a-antibody 
- Anti-COX-IV antibody: https://www.cellsignal.com/products/primary-antibodies/cox-iv-3e11-rabbit-mab/4850 
- Anti-Flag antibody: https://www.sigmaaldrich.com/catalog/product/sigma/a8592?lang=en&region=US 
- Anti-GAPDH antibody: https://www.scbt.com/scbt/product/gapdh-antibody-g-9?gclid=CjOKCOQjw9pDpBRCkARIsAOzRzivPevMxfm1PECD4RaBDNpWS5UMHHdAHUYTrRaPub43MCQxsOljzk_zEaAUIEALW_wcB 
- Anti-GFP antibody: https://www.aveslabs.com/collections/epitope-tag-6xhis-beta-gal-actin-and-gfp-antibodies/products/green-fluorescent-protein-gfp-antibody 
- Anti-chicken IgG, Alexa Fluor 488: https://www.thermofisher.com/antibody/product/Goat-anti-Chicken-IgY-H-L-Secondary-Antibody-Polyclonal/A-11039 
- Anti-OXPHOS antibody cocktail: https://www.abcam.com/total-oxphos-rodent-wb-antibody-cocktail-ab110413.html?gclid=CjOKCOQjw9pDpBRCkARIsAOzRziujlUITLatVARYKCilKliq1Wu_-hEZ13i6EOvOrr9vd03Z4rUF5CowaAhY9EALW_wcB&productWallTab=ShowAll 
- Anti-PDH-E1α antibody: https://www.scbt.com/scbt/product/pdh-e1alpha-antibody-d-6 
- Anti-PDH-E1α (pSer232) antibody: https://www.emdmillipore.com/US/en/product/PhosphoDetect-Anti-PDH-E1-pSer232-Rabbit-pAb,EMD_BIO-AP1063 
- Anti-PDH-E1α (pSer293) antibody: https://www.abcam.com/pyruvate-dehydrogenase-e1-alpha-subunit-phospho-s293-antibody-epr12200-ab177461.html 
- Anti-PDH-E1α (pSer300) antibody: https://www.emdmillipore.com/US/en/product/PhosphoDetect-Anti-PDH-E1-pSer300-Rabbit-pAb,EMD_BIO-AP1064 
- Anti-SLC25A44 antibody: Fig 4b; Ext Fig 6d; Ext Fig 7f-h,l; Ext Fig 9b; Ext Fig 10a,k. 
- Anti-TOM20 antibody: https://www.ptglab.com/products/TOM20-Antibody-11802-1-AP.htm 
- Anti-UCP1 antibody: https://www.abcam.com/ucp1-antibody-ab10983.html 
- Anti-β-actin antibody: https://www.sigmaaldrich.com/catalog/product/sigma/a3854?lang=en&region=US 


Eukaryotic cell lines 


Policy information about cell lines 


Cell line source(s) 


Preadipocytes were isolated from the interscapular BAT or the inguinal WAT of either wild-type or Ucp1 -/- mice and 
immortalized by infection with a retrovirus expressing SV40 large T antigen. 

Mouse embryonic fibroblasts (MEFs) were isolated from dCas9-KRAB mice and immortalized by infection with a retrovirus expressing 
SV40 large T antigen. 

Preadipocytes, isolated from either the supraclavicular BAT or subcutaneous WAT of healthy human participants, were 
immortalized as reported previously (Shinoda et al. Nature Medicine 2015). 
C2C12 (CRL-1772) and HEK293S (CRL-3022) cells were purchased from ATCC. 
The neuroblastoma cell line Neuro2a was purchased from Sigma-Aldrich (89121404). 


Authentication RNA-sequencing of the cell lines provided authentication. 
Mycoplasma contamination All the cell lines were routinely tested for mycoplasma infection and confirmed as negative for mycoplasma contamination. 


Commonly misidentified lines No commonly misidentified cell line was used. 
(See ICLAC register) 


Animals and other organisms 


Policy information about studies involving animals; ARRIVE guidelines recommended for reporting animal research 


Laboratory animals Mus musculus was used as an animal model. 
The strains were as follows: 
- C57BL/6J mice were obtained from the Jackson Laboratory. 
- Ucp1-Cre mice (Stock No. 024670) were obtained from the Jackson Laboratory. 
- Pparg flox mice (Stock No. 004584) were obtained from the Jackson Laboratory. 
- Bckdhatm1a(EUCOMM)Hmgu mice were obtained from EuMMCR and used for generation of Bckdha flox mice. 
- BAT-specific Bckdha-KO mice were generated by crossing Bckdha flox mice with Ucp1-Cre mice. 
- dCas9-KRAB mice were generated by a site-specific integrase-mediated approach (TARGATT system), in which a construct 
carrying dCas9-KRAB was integrated into the H11 locus to express dCas9-KRAB ubiquitously. 
- Transgenic (Tg) mice expressing a gRNA targeting Slc25a44 were generated by conventional transgenics. 
- Slc25a44 KD mice were generated by crossing dCas9-KRAB mice with Slc25a44-gRNA mice. 
Adult males and females aged 8-16 weeks were used for the experiments. Sex-matched littermate controls were used. All the 
mice had free access to food and water, were kept on a 12 h light cycle, and were caged at 23 °C. 


Wild animals This study did not involve wild animals. 


Field-collected samples This study did not involve samples collected from the field. 


Human research participants 


Policy information about studies involving human research participants 


Population characteristics Thirty-three healthy young Japanese male volunteers participated in this study (23.4 ± 0.58 years old). BMI was within the normal 
range (21.0 ± 0.30 kg/m2). 


Recruitment Volunteers were recruited by trial awareness-raising posters and consultant email-out. Potential selection biases were not 
detected/observed. 
Non-line-of-sight imaging using phasor-field virtual wave optics 


Xiaochun Liu, Ibón Guillén, Marco La Manna, Ji Hyun Nam, Syed Azer Reza, Toan Huu Le, Adrian Jarabo, 
Diego Gutierrez & Andreas Velten 


Non-line-of-sight imaging allows objects to be observed when 
partially or fully occluded from direct view, by analysing indirect 
diffuse reflections off a secondary relay surface. Despite many 
potential applications, existing methods lack practical usability 
because of limitations including the assumption of single scattering 
only, ideal diffuse reflectance and lack of occlusions within the 
hidden scene. By contrast, line-of-sight imaging systems do not 
impose any assumptions about the imaged scene, despite relying 
on the mathematically simple processes of linear diffractive wave 
propagation. Here we show that the problem of non-line-of- 
sight imaging can also be formulated as one of diffractive wave 
propagation, by introducing a virtual wave field that we term the 
phasor field. Non-line-of-sight scenes can be imaged from raw time- 
of-flight data by applying the mathematical operators that model 
wave propagation in a conventional line-of-sight imaging system. 
Our method yields a new class of imaging algorithms that mimic the 
capabilities of line-of-sight cameras. To demonstrate our technique, 
we derive three imaging algorithms, modelled after three different 
line-of-sight systems. These algorithms rely on solving a wave 
diffraction integral, namely the Rayleigh-Sommerfeld diffraction 
integral. Fast solutions to Rayleigh-Sommerfeld diffraction and 
its approximations are readily available, benefiting our method. 
We demonstrate non-line-of-sight imaging of complex scenes with 
strong multiple scattering and ambient light, arbitrary materials, 
large depth range and occlusions. Our method handles these 
challenging cases without explicitly inverting a light-transport 
model. We believe that our approach will help to unlock the potential 
of non-line-of-sight imaging and promote the development of 
relevant applications not restricted to laboratory conditions. 

We have recently witnessed considerable advances in transient imaging 
techniques that use streak cameras, gated sensors, amplitude-modulated 
continuous waves, single-photon avalanche diodes (SPADs) or 
interferometry. Access to time-resolved image information has led to 
advances in imaging of objects partially or fully hidden from direct 
view: that is, non-line-of-sight (NLOS) imaging. Other methods are able 
to use information encoded in the phase of continuous light and do not 
use the time of flight. In the basic configuration of an NLOS system, 
light bounces off a relay wall, travels to the hidden scene, then 
propagates back to the relay wall and finally reaches the sensor. 

Recent NLOS reconstruction methods are based on heuristic 
filtered backprojection or attempt to compute inverse operators 
of simplified forward light transport models. These simplified 
models do not take into account multiple scattering, surfaces with 
anisotropic reflectance or, with a few exceptions, occlusions or clutter 
in the hidden scene. The depth range that can be recovered is also 
limited, partially owing to the difference in intensity between first- and 
higher-order reflections. Existing methods are thus limited to carefully 
controlled cases, imaging isolated objects of simple geometry with 
moderate or no occlusion. Whereas the goal of previous works has 
been limited to the reconstruction of hidden geometry, we develop a 
theoretical framework for general NLOS imaging, reconstructing the 
irradiance at a virtual sensor; this enables applications beyond 
geometric reconstruction. 

Time-of-flight LOS imaging has used a phasor formalism (a phasor, 
or phase vector, is a complex number representing properties of 
a light wave) together with Fourier domain ranging to describe the 
emitted modulated light signal. Kadambi et al. extended this concept 
to reconstruct NLOS scenes by using a phasor model along with a 
non-line-of-sight capture system that uses intensity-modulated light sources 
and gain-modulated detection. We show that a similar description can 
be used to model the physics of light transport through the scene. The 
key insight is that propagation through a scene of intensity-modulated 
light can be modelled using a Rayleigh-Sommerfeld diffraction (RSD) 
operator acting on a quantity that we term the phasor field. This allows 
us to formulate any NLOS imaging problem as a wave imaging problem 
(Fig. 1) and to transfer well-established insights and techniques 
from classic optics into the NLOS domain. Given a captured 
time-resolved dataset of light transport through an NLOS scene, and a choice 
of a template LOS imaging system, our method provides a recipe that 
results in an NLOS imaging algorithm mimicking the capabilities of 
the corresponding LOS system. This template system can be any real 
or hypothetical wave imaging system that includes a set of light sources 
and detectors. The resulting algorithms can then be efficiently solved 
using diffraction integrals such as the RSD, for which various fast exact 
and approximate solvers exist. Supplementary Information section 
A illustrates this. 

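We start by mathematically defining our phasor field $\mathcal{P}(\mathbf{x}, t)$. Let 
$E(\mathbf{x}, t)$ (with units $\sqrt{\mathrm{W}}\,\mathrm{m}^{-1}$) be a quasi-monochromatic scalar field at 
position $\mathbf{x} \in S$ and time $t$, incident on (or reflected from) a Lambertian 
surface $S$, with centre frequency $\Omega_0$ and bandwidth $\Delta\Omega \ll \Omega_0$. We can 
then define 

$$\mathcal{P}(\mathbf{x}, t) = \left\langle \frac{1}{\tau} \int_{t-\tau/2}^{t+\tau/2} |E(\mathbf{x}, t')|^2 \, dt' \right\rangle - \left\langle \frac{1}{T} \int_{t-T/2}^{t+T/2} |E(\mathbf{x}, t')|^2 \, dt' \right\rangle \qquad (1)$$

as the mean subtracted irradiance (in watts per square metre) at point 
$\mathbf{x}$ and time $t$. The $\langle\cdot\rangle$ operator denotes spatial speckle averaging (for the 
reflected case) accounting for laser illumination, and $\tau$ represents 
the averaging of the intensity at a fast detector, with $\tau \ll 1/\Delta\Omega \ll T$. 
The second integral in equation (1) is a long-term average intensity 
over an interval $T \gg \tau$ of the signal as seen by a conventional 
non-transient photodetector. Now, let us define the Fourier component of 
$\mathcal{P}(\mathbf{x}, t)$ for frequency $\omega$ as 

$$\mathcal{P}_{0,\omega}(\mathbf{x}) = \int_{-\infty}^{+\infty} \mathcal{P}(\mathbf{x}, t) \, e^{-i\omega t} \, dt \qquad (2)$$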
from which we can define a monochromatic component of the phasor 
field $\mathcal{P}_{\omega}(\mathbf{x}, t)$ as 

$$\mathcal{P}_{\omega}(\mathbf{x}, t) = \mathcal{P}_{0,\omega}(\mathbf{x}) \, e^{i\omega t} \qquad (3)$$

Using the above, our phasor field $\mathcal{P}(\mathbf{x}, t)$ can be expressed as a 
superposition of monochromatic plane waves as 
$\mathcal{P}(\mathbf{x}, t) = \int_{-\infty}^{+\infty} \mathcal{P}_{\omega}(\mathbf{x}, t) \, d\omega / 2\pi$. 
Since $\mathcal{P}(\mathbf{x}, t)$ is a real quantity, the Fourier components $\mathcal{P}_{0,\omega}(\mathbf{x})$ are 
complex and symmetric about $\omega = 0$. Note that, in many places in this 
Letter, we assign $\mathcal{P}(\mathbf{x}, t)$ an explicitly complex value; in these cases, it is 
implied that the correct real representation is 
$\tfrac{1}{2}\left(\mathcal{P}(\mathbf{x}, t) + \mathcal{P}^{*}(\mathbf{x}, t)\right)$. In 
practice, the complex conjugate can be safely ignored in our calculations. 
As can be seen in Supplementary Information section B, given 
an isotropic source plane $S$ and a destination plane $D$, and assuming 
that the electric field at $S$ is incoherent, the propagation of its 
monochromatic component $\mathcal{P}_{\omega}(\mathbf{x}, t)$ is defined by an RSD-like propagation 
integral: 

$$\mathcal{P}_{\omega}(\mathbf{x}_d, t) = \gamma \int_{S} \mathcal{P}_{\omega}(\mathbf{x}_s, t) \, \frac{e^{ik|\mathbf{x}_d - \mathbf{x}_s|}}{|\mathbf{x}_d - \mathbf{x}_s|} \, d\mathbf{x}_s \qquad (4)$$

where $\gamma$ is an attenuation factor, $k = 2\pi/\lambda$ is the wavenumber for 
wavelength $\lambda = 2\pi/\omega$, $\mathbf{x}_s \in S$ and $\mathbf{x}_d \in D$. Note that, as described 
in Supplementary Information section B, we approximate $\gamma$ as a 
constant over the plane $S$ as $\gamma \approx 1/|\langle S \rangle - \mathbf{x}_d|$; this approximation has a 
minor effect on the signal amplitude at the sensor but does not change 
the phase of our phasor field. Although equation (4) is defined for 
monochromatic signals, it can be used to propagate broadband signals 
by propagating each monochromatic component independently; this 
can be efficiently done by time-shifting the phasor field (more details 
are provided in Supplementary Information section B.1). 

The key insight of equation (4) is that, given the assumption that $\gamma$ is 
a constant, the propagation of our phasor field is defined by the same 
RSD operator as any other physical wave. Therefore, to image a scene 
from a virtual camera with aperture on plane $C$, we can apply the image 
formation model of any wave-based LOS imaging system directly over 
the phasor field $\mathcal{P}(\mathbf{x}_c, t)$ at the aperture, with $\mathbf{x}_c \in C$. The challenge is 
how to compute $\mathcal{P}(\mathbf{x}_c, t)$ from an illuminating input phasor field 
$\mathcal{P}(\mathbf{x}_p, t)$, where $\mathbf{x}_p$ is a point in the virtual projector aperture $P$, given a 
particular NLOS scene (see Fig. 1). 

Fig. 1 | NLOS as a virtual LOS imaging system. a, b, Capturing scene 
data. a, A pulsed laser sequentially scans a relay wall (green); b, the light 
reflected back from the scene onto the wall is recorded at the sensor, 
yielding an impulse response H of the scene. c, Virtual light source. The 
phasor-field wave of a virtual light source $\mathcal{P}(\mathbf{x}_p, t)$ is modelled after the 
wavefront of the light source of the template LOS system. d, The scene 
response to this virtual illumination $\mathcal{P}(\mathbf{x}_c, t)$ is computed using H. e, The 
scene is reconstructed from the wavefront $\mathcal{P}(\mathbf{x}_c, t)$ using wave diffraction 
theory. The function $\Phi(\cdot)$ is also taken from the template LOS system. 
Amp., phasor-field amplitude. 

Because light transport is linear in space and time-invariant (refs 23, 24), we 
can characterize light transport through the scene as an impulse 
response function $H(\mathbf{x}_p \rightarrow \mathbf{x}_c, t)$, where $\mathbf{x}_p$ and $\mathbf{x}_c$ are the positions of 
the emitter and detector, respectively. The phasor field at the virtual 
aperture $\mathcal{P}(\mathbf{x}_c, t)$ can thus be expressed as a function of the input phasor 
field $\mathcal{P}(\mathbf{x}_p, t)$ and $H(\mathbf{x}_p \rightarrow \mathbf{x}_c, t)$: 

$$\mathcal{P}(\mathbf{x}_c, t) = \int_{P} \left[ \mathcal{P}(\mathbf{x}_p, t) \star H(\mathbf{x}_p \rightarrow \mathbf{x}_c, t) \right] d\mathbf{x}_p \qquad (5)$$

where $\star$ denotes the convolution operator. Any imaging system can be 
characterized by its image formation function $\Phi(\cdot)$, which transduces 
the incoming field into an image 

$$I(\mathbf{x}_v) = \Phi\left(\mathcal{P}(\mathbf{x}_c, t)\right) \qquad (6)$$

where $\mathbf{x}_v$ is the point being imaged (that is, the point at the virtual 
sensor). This, in turn, can be formulated as an RSD propagator, requiring 
a diffraction integral to be solved to generate the final image. 

Fig. 2 | Reconstructions of a complex NLOS scene. a, Photograph 
of the scene as seen from the relay wall. The scene contains occluding 
geometries, with objects towards the front (such as the chair) partially 
occluding the objects further back; multiple anisotropic surface 
reflectances; large depth; and strong ambient and multiply scattered light. 
b, 3D visualization of the reconstruction with phasor fields ($\lambda$ = 6 cm). 
We include the relay wall location and the coverage of the virtual aperture 
for illustrative purposes. c, Frontal view of the scene, captured with an 
exposure time of 10 ms per laser position. d, Frontal view captured with an 
exposure time of just 1 ms (24 s for the complete scan). 

Fig. 3 | Robustness of our technique. a, Reconstruction in the presence of strong ambient illumination (all the lights on during capture). b, Hidden scene 
with a large depth range, leading to very weak signals from objects farther away. 

In an NLOS scenario, $H(\mathbf{x}_p \rightarrow \mathbf{x}_c, t)$ usually corresponds to 
five-dimensional transients acquired by an ultrafast sensor focused on $\mathbf{x}_c$ 
and sequentially illuminating the relay wall with short pulses at 
different points $\mathbf{x}_p$ (see Fig. 1 and Methods). Points $\mathbf{x}_p$ and $\mathbf{x}_c$ correspond to 
a virtual LOS imaging system projected on the relay wall. Once 
$H(\mathbf{x}_p \rightarrow \mathbf{x}_c, t)$ has been captured, both the wavefront $\mathcal{P}(\mathbf{x}_p, t)$ and the 
imaging operator $\Phi(\cdot)$ can be implemented computationally, so they 
are not bounded by hardware limitations. We can leverage this to use 
different $\mathcal{P}(\mathbf{x}_p, t)$ functions from any existing LOS imaging system to 
emulate its characteristics in an NLOS setting. 
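
The role of equation (5) can be illustrated with a small discrete sketch, given here under stated assumptions rather than as the authors' implementation. The integral over the projector aperture becomes a sum over sampled laser positions, the convolution is taken along the time axis, and the names `virtual_aperture_field`, `p_in`, `H` and `area_element` are hypothetical.

```python
def convolve(a, b):
    """Full discrete convolution of two time sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out


def virtual_aperture_field(p_in, H, area_element=1.0):
    """Discrete version of equation (5).

    p_in[p][t]  : illumination phasor field at projector sample p
    H[p][c][t]  : measured impulse response from sample p to sample c
    returns     : p_out[c][t], the phasor field at the virtual aperture
    """
    n_c = len(H[0])
    n_t = len(p_in[0]) + len(H[0][0]) - 1
    p_out = [[0.0] * n_t for _ in range(n_c)]
    for p, signal in enumerate(p_in):
        for c in range(n_c):
            # Convolve in time, then accumulate over projector samples.
            for t, value in enumerate(convolve(signal, H[p][c])):
                p_out[c][t] += value * area_element
    return p_out


if __name__ == "__main__":
    # Two projector samples, one aperture sample, impulse illumination.
    H = [[[0, 1, 0]], [[0, 0, 2]]]
    p_in = [[1, 0], [1, 0]]
    print(virtual_aperture_field(p_in, H))
```

With an impulse as illumination, the output reduces to the sum of the impulse responses over the projector samples, which is a quick sanity check on any implementation. The values may be complex as well as real.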


We illustrate the robustness and versatility of our method by 
implementing three virtual NLOS imaging systems based on common LOS 
techniques: a conventional photography camera capable of imaging 
NLOS scenes without knowledge of the timing or location of the 
illumination source; a transient photography system capable of 
capturing transient videos of the hidden scene revealing higher-order 
interreflections (multiple light bounces between surface elements) 
beyond third bounce; and a confocal time-gated imaging system 
robust to interreflections. An in-depth description of these example 
imaging systems is provided in Supplementary Information section 
C, including their corresponding $\mathcal{P}(\mathbf{x}_p, t)$ functions and imaging 
operators, and section D describes some examples of practical 
integral solvers. 

Fig. 4 | Additional NLOS imaging applications of our method. a, NLOS 
refocusing. The hidden letters (left) are progressively brought in and out 
of focus as seen from a virtual photography camera at the relay wall, using 
the exact lens integral (blue border), and the faster Fresnel approximation 
(red border). b, NLOS transient video. Example frames of light travelling 
through a hidden office scene when illuminated by a pulsed laser. 
Timestamps indicate the propagation time from the relay wall. Frames with 
a green border show third-bounce objects, frames with an orange border 
show fourth- and fifth-bounce effects. 

The spatial resolution of our virtual camera is $\Delta_x = 0.61 \lambda L/d$, where 
$d$ is the virtual aperture diameter and $L$ is the imaging distance. The 
distance $\Delta_p$ between sample points $\mathbf{x}_p$ in $P$ (see Fig. 1) has to be small 
enough to sample H at the phasor-field wavelength. We fix $\Delta_p$ = 1 cm 
and, unless stated otherwise, $\lambda$ = 4 cm. The minimum sampling rate is 
$\Delta_p < \lambda/2$; in practice, we found $\Delta_p = \lambda/4$ to provide the best trade-off 
between reconstruction noise and resolution. 

The computational cost of our algorithm is bounded by the RSD 
solver computing the image formation model $\Phi(\cdot)$. Fast diffraction 
integral solvers exist, with complexity $O(N^3 \log N)$. For the particular 
case of our confocal system, we formulate the algorithm as a 
backprojection (see Supplementary Information section D.2 for details), and so 
we are bounded by the computational cost of the backprojection 
algorithm used. 
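
To make the cost of the diffraction integral concrete, equation (4) can be evaluated by a direct sum over sampled source points. The sketch below is a brute-force baseline under stated assumptions, not one of the fast solvers referred to in the text: the function name `rsd_propagate` and its arguments are hypothetical, the attenuation factor is passed in as a constant `gamma`, and the surface element is a uniform `area_element`.

```python
import cmath
import math


def rsd_propagate(source_points, source_field, dest_points,
                  wavelength, gamma=1.0, area_element=1.0):
    """Direct evaluation of the RSD-like sum for one monochromatic component.

    source_points : list of (x, y, z) samples on the source plane S
    source_field  : complex phasor-field value at each source sample
    dest_points   : list of (x, y, z) samples on the destination plane D
    wavelength    : phasor-field wavelength, in the same units as the points
    """
    k = 2 * math.pi / wavelength  # wavenumber
    result = []
    for x_d in dest_points:
        total = 0j
        for x_s, p_s in zip(source_points, source_field):
            r = math.dist(x_d, x_s)  # |x_d - x_s|
            total += p_s * cmath.exp(1j * k * r) / r
        result.append(gamma * total * area_element)
    return result


if __name__ == "__main__":
    # One source sample; destinations one and one-half wavelength away.
    out = rsd_propagate([(0.0, 0.0, 0.0)], [1.0],
                        [(0.0, 0.0, 0.04), (0.0, 0.0, 0.02)], wavelength=0.04)
    print(out)
```

The double loop costs one complex exponential per source-destination pair, which is what makes a direct evaluation impractical for dense apertures and volumes and motivates the faster solvers.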


One common application of NLOS imaging is the reconstruction 
of hidden geometry. Figure 2 shows the result for a complex scene 
imaged with our virtual confocal camera. This challenging scene con- 
tains multiple objects with occlusions distributed over a large depth, 
a wide range of surface reflectances and albedos, and strong interre- 
flections. Our method is able to image many details of the scene, at the 
correct depths, even with an ultra-short (1 ms) exposure. More analysis 
on the robustness of our method to capture noise can be found in the 
Methods. For simpler scenes (no occlusions, limited depth, controlled 
reflectance and no interreflections), our method yields results on par 
with current techniques, which already approach theoretical limits for 
reconstruction quality (see Methods). 

In Fig. 3, we demonstrate the robustness of our method when dealing 
with other challenging scenarios, including strong multiple scatter- 
ing and ambient illumination (Fig. 3a), or a high dynamic range from 
objects spanning a large range of depths (Fig. 3b). Finally, our method 
allows new NLOS imaging systems and applications to be implemented, 
making use of the wealth of tools and processing methods available 
in LOS imaging. Figure 4a demonstrates NLOS refocusing with our 
virtual photography camera, computed using both the exact RSD oper- 
ator and a faster Fresnel approximation, while Fig. 4b shows frames of 
NLOS femto-photography reconstructed using our virtual transient 
photography system, revealing fourth- and fifth-bounce components 
in the scene. The first, second and fourth frames, in green, show how 
light first illuminates the chair, then propagates to the shelf and finally 
hits the back wall 3 m away. The frames in orange show higher-order 
bounces. The third frame shows that the chair is illuminated again 
by light bouncing back from the relay wall, and the last two frames 
show how the pulse of light travels from the wall back to the scene 
(see Supplementary Video 1). A description of the Fresnel approximation 
to the RSD operator, as well as the LOS projector-camera functions 
used in these examples, appears in Supplementary Information sections 
D.1 and C.2. 

In the Methods, we include comparisons against ground truth for 
two synthetic scenes, inside a corridor of 2 m × 2 m × 3 m to create 
interreflections, simulated using an open-source transient renderer; 
these scenes are included in a publicly available database. We analyse 
the robustness of our method with and without such interreflections; 
the reconstruction mean square error (MSE) does not increase, 
remaining below 5 mm. Finally, we progressively vary the specularity of the 
hidden geometry, from purely Lambertian to highly specular; again, 
the quality of the reconstructions does not vary significantly (MSE of 
about 2 mm). 

The examples shown highlight the primary benefit of our approach. 
By turning NLOS into a virtual LOS system, the intrinsic limitations of 
previous approaches no longer apply, enabling a class of NLOS imaging 
methods that take advantage of existing wave-based imaging methods. 
Formulating NLOS light propagation as a wave does not impose limi- 
tations on the types of problems that can be addressed, nor the datasets 
that can be used. Any signal can be represented as a superposition of 
phasor-field waves; our formulation can thus be viewed as a choice 
of basis to represent any kind of NLOS data. Expressing the NLOS 
problem this way allows a direct analogy to LOS imaging, which can 
be exploited to derive suitable imaging algorithms and to implement 
them efficiently. 

We have shown three imaging algorithms derived from our method. 
Our results include more complex scenes than in NLOS reconstructions 
shown so far in the literature, as well as new applications. In addition, 
our approach is flexible, fast, memory-efficient and computationally 
simple, since it does not require inverting a light transport model. 
We anticipate that it can be applied to other LOS imaging systems, for 
instance to separate light transport into direct and global components, 
or to use the phase of $\mathcal{P}_{0,\omega}$ for enhanced depth resolution. Our approach 
could also be used to create a virtual imaging system that 
sees around two corners, assuming the presence of a secondary relay 
Lambertian surface in the hidden scene, or to select and manipulate 
individual light paths to isolate specific aspects of the light transport in 
different NLOS scenes. In that context, combining our theory with light 
transport inversions, via, for example, an iterative approach, could 
potentially lead to better results and is an interesting avenue for future work. 
METHODS 


Details on data acquisition. Hardware configuration. Our capture system, shown 
in Extended Data Fig. 1, consists of a Onefive Katana HP amplified diode laser 
(1 W at 532 nm, and a pulse width of about 35 ps used at a repetition rate of 
10 MHz) and a gated SPAD detector processed by a time-correlated single-photon 
counter (PicoQuant HydraHarp), with a time resolution of about 30 ps and a 
dead time of 100 ns. Two additional charge-coupled device (CCD) cameras are 
used to calibrate the laser’s position. The measured time resolution of our system 
is approximately 65 ps, a combination of the pulse width of the laser and the time 
jitter of the system. 

NLOS measurement geometry. We obtain an impulse response function $H(\mathbf{x}_p \rightarrow \mathbf{x}_c, t)$ 
of the scene by sequentially illuminating points $\mathbf{x}_p$ on the relay wall with a short 
pulse and detecting the signal returning at points $\mathbf{x}_c$. 

Our hardware device is located 2.5 m from the relay wall, with the NLOS 
scenes hidden from direct view. The field of view is 25°. The walls are made of 
standard white styrofoam. The scanning area in the relay wall (virtual camera) is 
1.8 m × 1.3 m, with laser points $\mathbf{x}_p$ spaced by $\Delta_p$ = 1 cm in each direction. The 
SPAD is focused at a position near the centre of the grid. We avoid scanning a small 
square region around the SPAD focused position (the confocal position) because 
the signal becomes very noisy at this location. Figures 2, 3 provide additional details 
for the specific scenes shown. 
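
The scanning geometry above fixes the sampling of the virtual aperture, and the numbers can be checked with a short sketch. It uses the main-text relations (sampling condition $\Delta_p < \lambda/2$ and resolution $0.61 \lambda L/d$); taking the 1.8 m aperture width as the diameter $d$ and an imaging distance of 1 m are illustrative assumptions, and the function name is hypothetical.

```python
def virtual_camera_numbers(width_m=1.8, height_m=1.3, spacing_m=0.01,
                           wavelength_m=0.04, distance_m=1.0):
    """Grid size, sampling check and resolution for the virtual camera.

    distance_m (imaging distance L) and the use of width_m as the aperture
    diameter d are assumptions made only for this illustration.
    """
    grid = (round(width_m / spacing_m), round(height_m / spacing_m))
    sampling_ok = spacing_m < wavelength_m / 2           # Delta_p < lambda / 2
    samples_per_wavelength = wavelength_m / spacing_m    # 4 means Delta_p = lambda / 4
    resolution_m = 0.61 * wavelength_m * distance_m / width_m
    return grid, sampling_ok, samples_per_wavelength, resolution_m


if __name__ == "__main__":
    grid, ok, per_wavelength, resolution = virtual_camera_numbers()
    print(grid, ok, per_wavelength, f"{resolution * 100:.2f} cm")
```

Under these assumptions the lateral resolution at 1 m is roughly 1.4 cm, and it degrades linearly with imaging distance.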
Exposure time. Our capture set-up includes CCD cameras (Extended Data Fig. 1) to 
confirm the 3D position of every laser during the measurement; these are a limiting 
factor in the speed of our experiments. Because the capture process runs in parallel, 
we use a very long (1 s) exposure time per laser position for some datasets. These 
long-exposure datasets are used for all results unless otherwise specified. In addition, we capture scenes 
without the additional CCD photographs; these can be collected much faster and 
with much shorter exposure times. In Fig. 2, we show datasets of an office scene 
captured with exposure times of 1 ms to 10 ms per laser position, which results 
in a total capture time as low as 24 s. Further reconstructions of a shelf dataset are 
shown later as additional results, showing that we can reduce exposure times at 
least down to 50 ms per data point without a significant loss in quality, even with 
ambient light. This results in less than 20 min of total capture time. In our current 
prototype, we capture data sequentially with a single SPAD. Prototype SPAD arrays 
are currently under development, and it seems likely that a 16 x 16 array will be 
available by the end of the year. We thus expect to be able to capture 256 data points 
in less than 0.1 s in the near future. 
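
The capture times quoted above follow from the number of laser positions multiplied by the exposure per position. The sketch below reproduces that arithmetic, assuming the full 180 × 130 grid and ignoring the small skipped region around the confocal position and any scanning overhead; the function name is hypothetical.

```python
def capture_time_s(exposure_s, n_cols=180, n_rows=130):
    """Total capture time for a sequential scan with a single SPAD."""
    return n_cols * n_rows * exposure_s


if __name__ == "__main__":
    for exposure in (0.001, 0.05, 1.0):
        seconds = capture_time_s(exposure)
        print(f"{exposure * 1000:g} ms per position: "
              f"{seconds:.1f} s ({seconds / 60:.1f} min)")
```

For 1 ms per position this gives about 23.4 s, consistent with the 24 s quoted for the fastest office scan, and 50 ms per position gives 19.5 min, consistent with the figure of less than 20 min.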

Collected data. In total (counting captures with different lighting and exposure 
times as different sets), we use 12 experimental and two simulated datasets. All 
experimental datasets use a single SPAD location and 180 by 130 laser positions. 
The datasets and exposure times are: 

• An office scene collected with 1 s exposure per laser position. This dataset is 
used to create the video shown in Supplementary Video 1, frames of which are 
shown in Fig. 4b. A photograph and reconstruction of this scene are also shown 
in the Supplementary Video. The data are analysed in Extended Data Fig. 3 and 
Extended Data Table 1. 

• An office scene collected with exposures of 10 ms, 5 ms and 1 ms, used in Fig. 2, 
Extended Data Figs. 6-8 and Extended Data Table 1. 

• A scene of a bookshelf used in Fig. 3a and in Extended Data Table 1. 

• A scene of a bookshelf captured with various exposure times and ambient light 
conditions, shown in Extended Data Fig. 2 and Extended Data Fig. 5. 

• A scene with letters distributed over a large depth, used in Fig. 3b and Extended 
Data Table 1. 

• A scene of the letters NLOS in a plane, used in Fig. 4a and Extended Data 
Table 1. 

To provide further insight into the noise and artefacts present in our data, we go 
through an analysis of the raw data from our 1-s-exposure office scene. We compare 
the maximum and average number of photons per second and laser position xp for 
our captured scenes in Extended Data Table 1. The dark count rate of our detector 
is 10 photons per second. We do not explicitly subtract dark counts, ambient 
light or backgrounds. The high total photon numbers in the transient responses 
(Extended Data Table 1) are due to the long responses associated with the large 
depth and volume of the scenes, and not due to a particularly bright signal. Example 
data for a scene of a shelf are shown in Extended Data Fig. 2 (whose reconstruction 
can be found in Extended Data Fig. 5). In this scene, our longest (1 s) exposure time 
peaks at about 150 photons per second (such peaks are probably due to the presence 
of specular surfaces), and the captured signal is extremely noisy. In comparison, the 
recent method by O’Toole et al.” acquires a brighter, cleaner signal in 0.1 s, peaking 
at about 600 photons per second, owing to the use of retroreflective paint applied on 
the hidden objects (data from their data_resolution_chart_40cm dataset). 

Let us further analyse the captured data. In Extended Data Fig. 3a, we show 
a visualization of our data matrix for the 1-s-exposure office scene using the 
Matlab function imagesc, in which each row is the data collected for a different 
location of the laser illumination spot, and each column contains a different time 
bin. The first time bin corresponds to the time when the illumination laser pulse 
leaves the relay wall. In the images, we do not show time bins 10,001 to 15,000 as 
they are mostly empty, owing to the closing of the gate. As can be seen, there are 
some sparse, very large peaks in the dataset that saturate the counting registers of 
our time-correlated single-photon counter (2^16 − 1 counts). As we will see, these 
artefacts in the data are likely to be due to imperfections in the gating or optical 
set-up. 

Let us focus on the first instants of the captured data shown in Extended Data 
Fig. 3a, which reveal features that look like straight diagonal lines in the first few 
time bins. The fact that there are straight lines in this plot indicates that they are 
likely related to a first-bounce signal, rather than the scene response. NLOS signals 
should show up as hyperbolas or sections of hyperbolas in this type of visualization, 
and the curvature of the hyperbolas should be highest at the earliest time bins. 
The image contains many more features that look like straight lines that do not 
appear to have the correct hyperbolic curvature to be NLOS signals. Many of them 
also appear identically in the other datasets, which is another hint that they are 
probably not real NLOS data but artefacts related to the measurement system. Our 
algorithm is completely agnostic to the presence of these artefacts. The brightest 
peaks also appear too early in the data to be associated with a NLOS object. To see 
this, consider that the closest object in any of our scenes is the chair in the office 
scene, and it is more than 50 cm away from the wall. Consequently, the first time 
response from an actual object cannot arrive at the SPAD earlier than 3.3 ns after 
the laser illuminates the relay wall. Time bins are 4 ps wide. Any data before time 
bin 833 therefore can only be an artefact. We will speculate more about the origin 
of these artefacts later. 
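
The cut-off bin quoted above follows from simple time-of-flight arithmetic. A minimal sketch, assuming the shortest third-bounce path is a straight round trip from the wall to the closest object and back (the function name is illustrative, not taken from the authors' code):

```python
# Earliest time bin that can hold genuine NLOS (third-bounce) signal.
C = 299_792_458.0  # speed of light in vacuum, m/s


def earliest_signal_bin(min_distance_m, bin_width_s):
    """Return the index of the first time bin that can contain light
    returning from an object at least min_distance_m from the wall.

    The light travels wall -> object -> wall, so the shortest possible
    path is twice the distance to the closest object.
    """
    round_trip_s = 2.0 * min_distance_m / C
    return int(round_trip_s / bin_width_s)


# Closest object 50 cm from the wall, 4 ps time bins.
print(earliest_signal_bin(0.5, 4e-12))  # 833
```

Anything recorded before this bin would have to travel faster than light to be scene signal, which is why it is treated as an artefact.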

If we ignore those first 833 time bins that contain no useful data, we obtain a 
dataset that can yield some meaningful statistics about the data. In this dataset, 
the largest photon count in all our over 200 million time bins is smaller than 1,400 
photons. As we show below, this 1,400 maximum is probably still due to a gate 
artefact that happened to occur slightly later than 3 ns into the dataset. Statistics 
for all datasets are shown in Extended Data Table 1. 

Maximum photon counts usually come from the objects in the scene closest 
to the wall. Considering the large depth and specularities of our scenes, most of 
the reconstructed scene volume is using signals much weaker than the maximum 
signal, as voxels are further from the wall. Signals from a given surface are expected 
to drop in magnitude with distance L as 1/L^4. An object generating 100 photon 
peaks at 50 cm distance in the front of our scene would therefore only create 100/16 ≈ 6 
photons if placed at 1 m and 100/625 ≈ 0.2 photons at 2.5 m towards the end of 
our office scene. This ability to handle scenes with large dynamic range in the data 
is another advantage of our algorithm. 
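
To make the dynamic range concrete, the sketch below scales a reference peak under a strict inverse-fourth-power falloff. It is only an illustration of that scaling law; the function name and the idea of a fixed reference distance are assumptions made for the example.

```python
def scaled_peak(peak_at_ref, ref_distance_m, distance_m):
    """Scale a photon peak measured at ref_distance_m to another distance,
    assuming the signal falls off as 1/L^4."""
    return peak_at_ref * (ref_distance_m / distance_m) ** 4


# A 100-photon peak at 0.5 m, moved deeper into the scene.
for distance in (0.5, 1.0, 2.5):
    print(f"{distance} m: {scaled_peak(100.0, 0.5, distance):.2f} photons")
```

Doubling the distance costs a factor of 16 and a fivefold increase costs a factor of 625, so the far end of a 2.5 m scene contributes well under one photon per peak.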

In Extended Data Fig. 3b, we show a plot of the photon counts over time bins for 
the laser position that received the most total photons. We again see the extreme 
peak of 2^16 − 1 counts at the beginning of the dataset. Again, this peak cannot be 
a real third-bounce signal as it would require the pulse to travel between the laser 
position and SPAD position much faster than the speed of light. The actual NLOS 
data start around time bin 1,000 and peak at just above 50 photons. 

Finally, we show a plot of the laser position that received the total photon count 
closest to the median of all laser positions (Extended Data Fig. 3c). We can see that 
the count generally stays below 150 photons, with what are probably specular peaks 
reaching 200 photons and a large (450 photon) peak at the beginning of the dataset 
that is either a specular peak or another gate artefact. Note that as we illuminate 
only a grid of points at the wall, we do not capture all the specular peaks in our data. 
To see a specular reflection peak from a scene surface, we have to be lucky enough 
to illuminate the exact spot on the wall that results in the specular reflection that 
overlaps with the SPAD position (see Supplementary Fig. 2 for an illustration). 
Therefore, specular peaks in our measurements can vary greatly, depending on 
how close to the peak the laser sampled the wall. Again, we point out that this type 
of uncontrolled artefact does not affect our algorithm. 

As we stated above, the time bin with the highest photon count when ignoring 
obvious early artefacts contains about 1,400 photons. Next we plot the laser posi- 
tion that contains this time bin (Extended Data Fig. 3d). Note that zero on the x axis 
here corresponds to time bin 834. As we see, the 1,400 photon peak appears very 
close to the beginning of the transient and may be a gating artefact that occurs in 
the data just after the opening of the gate. This type of data distortion is described 
further below. If not a gating artefact, the peak is probably a specular reflection, 
as it is very narrow and could only be caused by a small isolated diffuse patch or a 
specular surface in the scene. Peaks from extended diffuse surfaces are necessarily 
longer in duration. 

We conclude that although our data contain artefacts, the photon counts 
useful for reconstructions are no higher or cleaner than in previous methods. 
Note that the removal of early artefacts is only done here to generate Extended 
Data Fig. 3b-d, to allow visualization. All reconstructions shown in the manu- 
script contain the full recorded data without the removal of any potential artefacts 
or time bins. 


Even though an understanding of the origin of the artefacts is not needed for 
our method, we can speculate on the sources of some of them. 

(1) Many of the early peaks in our data are likely to be related to imperfections in 
our gating method. When the SPAD gate opens just after the laser pulse has passed, 
photoelectrons in the SPAD may cause a detection event that is not due to a photon 
but to electrons excited by the first-bounce light and trapped in long-lived states 
in the SPAD. Even though these electrons are not amplified, they need to be trans- 
ported off the SPAD junction or they can cause counts as soon as the gate opens. 

(2) The gate may not block the pulse for some laser positions. The gate has to 
be positioned such that it blocks the laser in all laser positions while not blocking 
any signal. This is not always possible, and we do not re-adjust the gate for each 
position while scanning. 

(3) Effects inside the imaging system can keep light trapped long enough to 
cause a peak at the time when the NLOS data arrive. This can be due to multiple 
reflections between lenses, multiphoton fluorescence in the glass or coating of the 
lenses, or stray light reflecting off a random surface at the right distance. We have 
confirmed some of these effects but suspect there are many more. 

(4) In particular, we can see light that travels from the laser spot to the SPAD, 
reflects off the surface of the SPAD pixel, is imaged back to the relay wall and comes 
back to the SPAD. In confocal or near-confocal configurations, this can create a 
peak that is many times brighter than the data. 

Retroreflective targets can be used to reduce many of these artefacts, most of 
which are created either by the laser or a first-bounce reflection of the laser. If the 
hidden target is retroreflecting, the ratios between the brightness of the laser and 
its first bounce and the brightness of the third-bounce NLOS data are reduced by 
multiple orders of magnitude. 

Helmholtz reciprocity. Ideally, we would capture H(xp → xc, t) sampling points on 
both the projector aperture xp ∈ P and the camera aperture xc ∈ C. In our current 
set-up with a single SPAD, we only sample a single point for xc. From Helmholtz 
reciprocity, we can interpret these datasets as having a single xp and an array of 
xc. The choice of capture arrangement is made for convenience, as it is easier to 
calibrate the position of the laser spot on the wall. Improved results are anticipated 
once array sensors become available (currently under development). 
Additional validation and discussion. Resolution limits. The resolution limit 
for NLOS imaging systems with an aperture diameter d at imaging distance L 
is closely related to the Rayleigh diffraction limit’: Δx = 1.22cσL/d, with c the 
speed of light in vacuum, for a pulse of full width at half maximum σ. O’Toole 
et al.? derive a criterion for a resolvable object based on the separability of the 
signal in the raw data, not in the reconstruction, resulting in a similar formula, 
Δx = 0.5cσL/d ≈ 0.5λL/d. 

In our virtual LOS imaging system, we can formulate a resolution limit that 
ensures a minimum contrast in the reconstruction, based on the well-known 
resolution limits of wave-based imaging systems. The resolution limit therefore 
depends on the particular choice of virtual imaging system. For an imaging system 
that uses focusing only on the detection or illumination side, this limit is 
approximated by the Rayleigh criterion. For an imaging system that provides focusing on 
both the light source and the detector side, the resolution doubles (as it does, for 
example, in a confocal or structured illumination microscope) and the resolution 
limit becomes Δx = 0.61λL/d. 
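
As a rough numerical illustration of the two limits, the sketch below evaluates the Rayleigh-type expression with prefactor 1.22 (focusing on one side only) and 0.61 (focusing on both sides). The input values are illustrative assumptions chosen to resemble the synthetic scenes, not measured quantities, and the function name is hypothetical.

```python
def resolution_limit_m(wavelength_m, distance_m, aperture_m, two_sided_focus=False):
    """Rayleigh-type lateral resolution limit k * wavelength * L / d.

    k = 1.22 when only the detection or illumination side is focused,
    k = 0.61 when both sides are focused (resolution doubles).
    """
    k = 0.61 if two_sided_focus else 1.22
    return k * wavelength_m * distance_m / aperture_m


# Assumed example: 2.8 cm virtual wavelength, 1.7 m distance, 1.792 m aperture.
one_sided = resolution_limit_m(0.028, 1.7, 1.792)
two_sided = resolution_limit_m(0.028, 1.7, 1.792, two_sided_focus=True)
print(f"{one_sided * 100:.1f} cm, {two_sided * 100:.1f} cm")  # roughly 3.2 cm and 1.6 cm
```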
Effect of strong interreflections. To confirm the presence and effect of strong 
interreflections in our captured data, we compare the data qualitatively with primary 
data from a synthetic bookshelf scene, with and without interreflections. The 
bookshelf is placed in a corridor of 2 m × 2 m × 3 m, with only a single lateral 
aperture of 1 m × 2 m to allow the hidden scene to be imaged. The shelf has a size 
of 1.4 m × 0.5 m, placed at 1.7 m from the relay wall and 0.3 m from the lateral 
walls. The virtual aperture has a size of 1.792 m × 1.792 m and a granularity of 
256 × 256 laser points; we use λ = 4Δp and Δp = 0.7 cm. 
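
The sampling numbers quoted for the synthetic scenes are self-consistent if the grid spacing is the aperture side divided by the number of laser points per side, and the virtual wavelength is four times that spacing. A short check under that assumption (function names are illustrative):

```python
def grid_spacing_m(aperture_side_m, points_per_side):
    """Spacing between neighbouring laser points on a square aperture."""
    return aperture_side_m / points_per_side


def virtual_wavelength_m(spacing_m, factor=4.0):
    """Virtual wavelength chosen as a multiple of the grid spacing."""
    return factor * spacing_m


for points in (256, 128):
    spacing = grid_spacing_m(1.792, points)
    wavelength = virtual_wavelength_m(spacing)
    print(f"{points} x {points}: spacing {spacing * 100:.1f} cm, "
          f"wavelength {wavelength * 100:.1f} cm")
```

The 256 × 256 grid gives a 0.7 cm spacing and the 128 × 128 grid used for the non-Lambertian test gives 1.4 cm, matching the values in the text.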

As can be seen in Extended Data Fig. 4, the synthetic data clearly show how the 
presence of interreflections adds, as expected, low-frequency information resem- 
bling echoes of light. The same behaviour can be seen in the real captured data, 
revealing the presence of strong interreflections. 

Additionally, we evaluate the robustness of our method in the presence of such 
interreflections. Similar to recent work’, we compare between a voxelization of the 
ground-truth geometry and a reconstructed voxel-grid obtained from our irradi- 
ance reconstructions, with and without including interreflections; the resulting 
MSE is as follows: without interreflections (Extended Data Fig. 4a), MSE = 4.93 mm; 
with interreflections (Extended Data Fig. 4b), MSE = 4.66 mm. 

Effect of exposure time. Ambient light. To analyse how well our technique works 
in ambient light and with much shorter exposure times, we perform several addi- 
tional measurements using progressively shorter exposure times, showing that 
we can reduce exposure times at least down to 50 ms per data point without a 
significant loss in quality (see Extended Data Fig. 5). Extended Data Fig. 2 shows 
raw data for one of the laser positions. In particular, it shows the number of photons 
per second accumulated in each time bin (that is, the collected histogram divided 
by the integration time in seconds). As expected, all three curves appear to follow 
the same mean but have a larger variance for lower exposure times. The raw data 
thus become noisier as exposure time decreases. The effects on our reconstruction, 
however, are minor, as Extended Data Fig. 5 shows. 
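
The growth in variance at short exposures is what one would expect from photon-counting noise. A minimal sketch, assuming the counts in a bin are Poisson distributed (an assumption for illustration; the text does not state the noise model), shows how the spread of the photons-per-second estimate depends on exposure time:

```python
import math


def rate_estimate_std(true_rate_hz, exposure_s):
    """Standard deviation of the estimate counts / exposure when
    counts ~ Poisson(true_rate_hz * exposure_s).

    Var(counts) = rate * T, so Var(counts / T) = rate / T.
    """
    return math.sqrt(true_rate_hz / exposure_s)


# Hypothetical bin with a true rate of 150 photons per second.
for exposure in (1.0, 0.1, 0.05):
    print(f"{exposure * 1000:.0f} ms: +/- {rate_estimate_std(150.0, exposure):.1f} photons/s")
```

The mean of the estimate is unchanged, but its standard deviation grows as one over the square root of the exposure time.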

Short-exposure captured data. Extended Data Fig. 6 shows the reconstruction of 
the office scene (Fig. 2) for short exposure times of 10 ms, 5 ms and 1 ms for each 
of the roughly 24,000 laser positions. This leads to total capture times of about 
4 min, 2 min and 24 s, respectively. Plots showing raw data from those datasets are 
given in Extended Data Fig. 7. 
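
The total capture times follow directly from the grid size and the per-position exposure. A small sketch, assuming the 180 by 130 grid and ignoring any scanning or readout overhead (that omission is an assumption of the example):

```python
def total_capture_time_s(n_positions, exposure_s):
    """Total exposure time for a sequential scan, without overheads."""
    return n_positions * exposure_s


N_POSITIONS = 180 * 130  # 23,400 laser positions

for exposure_ms in (50, 10, 5, 1):
    seconds = total_capture_time_s(N_POSITIONS, exposure_ms / 1000.0)
    print(f"{exposure_ms} ms per position: {seconds:.1f} s ({seconds / 60:.1f} min)")
```

This reproduces the figures of about 4 min, 2 min and 24 s for 10 ms, 5 ms and 1 ms, and just under 20 min for 50 ms.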

We compare the results of our reconstructions on the 1 ms data against 
filtered backprojection with a Laplacian filter°, as well as the Laplacian-of-Gaussian 
(LOG)-filtered backprojection’’, which generally achieves better results. We are 
not aware of any reconstruction method that consistently outperforms a LOG- 
filtered backprojection. Extended Data Fig. 8 shows the result of this comparison. 
Non-Lambertian surfaces. To validate the robustness of our method in the 
presence of non-Lambertian materials in the hidden scene, we have created a synthetic 
scene made up of two letters, R and D, one partially occluding the other, placed in 
a corridor of 2 m × 2 m × 3 m, with only a single lateral aperture of 1 m × 2 m to 
allow imaging the hidden scene. The letters have a size of 0.75 m × 0.8 m, placed at 
1.25 m and 1.7 m from the relay wall, respectively, and 0.5 m from the lateral walls 
(see Extended Data Fig. 9a). The virtual aperture has a size of 1.792 m × 1.792 m 
and a granularity of 128 × 128 laser points; we use λ = 4Δp with Δp = 1.4 cm. 
We start with purely Lambertian targets and progressively increase their 
specularity. We use the Ward BRDF model”’, decreasing the surface roughness, using 
available transient rendering software”®. The simulation includes up to the fifth 
indirect bounce. 

Extended Data Fig. 9b shows the resulting irradiance reconstructions. Because 
our method does not make any assumption about the surface properties of the 
hidden scene, the changes in material appearance do not significantly affect our 
irradiance reconstructions. Similar to recent work’, we compare a voxelization of 
the ground-truth geometry and the reconstructed voxel-grid; the resulting MSE for 
each of the different reflectances is as follows: for a surface roughness of 1 (perfect 
Lambertian), MSE = 2.1 mm; for a surface roughness of 0.4, MSE = 2.2 mm; for a surface 
roughness of 0.2, MSE = 2.2 mm. 
Reconstruction comparison with other methods. Our imaging system allows 
hidden geometry to be reconstructed. For this application, we show a comparison 
using the publicly available confocal dataset. This set can be reconstructed using 
different NLOS methods; we show results for confocal NLOS deconvolution’, 
filtered backprojection’ and our proposed method. For these confocal 
measurements, backprojection can be expressed as a convolution with a pre-calculated 
kernel, and thus all three methods are using the same backprojection operator. 
Neither our method nor filtered backprojection is limited to confocal data, and 
both can use data acquired with simpler devices and capture 
configurations. They can thus be applied to a broader set of configurations and considerably 
more complex scenes. For the confocal NLOS deconvolution method’, we leave 
the optimal parameters unchanged. For our proposed virtual wave method, we use 
the aperture size and its spatial sampling grid (see Supplementary Information) to 
calculate the optimal phasor-field wavelength. For the filtered backprojection, it 
is important to choose a good discrete approximation of the Laplacian operator in 
the presence of noise. Previous works implicitly do the denoising step by adjusting 
the reconstruction grid size to approximately match the expected reconstruction 
quality”*’, or by downsampling across the measurements’. If used correctly, all 
of these methods result in a high-quality reconstruction from a Laplacian filter. 
To provide a fair comparison without changing the reconstruction grid size, we 
convolve a Gaussian denoising kernel with the Laplacian kernel, resulting in a LOG 
filter, which we apply over the backprojected volume. 
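
The construction of a LOG filter by convolving a Gaussian with a discrete Laplacian can be sketched in one dimension as follows. This is an illustration of the idea only: the actual filter is applied over a 3D backprojected volume, and the kernel width, radius and function names here are assumptions, not the parameters used for the reported results.

```python
import math

LAPLACIAN_1D = [1.0, -2.0, 1.0]  # standard second-difference stencil


def gaussian_kernel(sigma, radius):
    """Normalized 1D Gaussian sampled on integers in [-radius, radius]."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_full(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def log_kernel(sigma, radius):
    """Laplacian-of-Gaussian kernel: Gaussian denoising kernel convolved
    with the Laplacian stencil."""
    return convolve_full(gaussian_kernel(sigma, radius), LAPLACIAN_1D)


kernel = log_kernel(sigma=1.0, radius=3)
print(len(kernel), abs(sum(kernel)) < 1e-12)
```

Because the Laplacian stencil sums to zero, so does the combined kernel: it suppresses constant background while the Gaussian part limits noise amplification.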

Note that a large improvement in reconstruction quality for the simple scenes 
included in the dataset (isolated objects with no interreflections) is not to be 
expected, since existing methods already deliver reconstructions approaching 
their resolution limits. We nevertheless achieve improved contrast and cleaner 
contours in our wave camera method, due to our better handling of multiply scat- 
tered light, which pollutes the reconstructions in the other methods (see Extended 
Data Fig. 10). 

In the noisy datasets (Extended Data Fig. 11), filtered backprojection fails. 
Confocal NLOS includes a Wiener filter that performs well at removing uniform 
background noise, although a noise level must be explicitly estimated. Our 
phasor-field virtual wave method, on the other hand, performs well automatically, 
without the need to explicitly estimate a noise level. This is important in complex 
scenes with interreflections, where the background is not uniform across the scene, 
and the noise level cannot be reliably estimated. 

Nevertheless, our main contribution is not that of improving the reconstruction 
for simple, third-bounce scenes. Instead, our method allows a new class of NLOS 
algorithms to be derived, which can successfully handle scenes of much greater 
complexity. 
Data availability 

The measured data and the phasor-field NLOS code supporting the findings 
of this study are available in the figshare repository https://doi.org/10.6084/ 
m9.figshare.8084987. Additional data and code are available from the correspond- 
ing authors upon request. 


Code availability 


Our data and reconstruction code can be found in the figshare repository https:// 
doi.org/10.6084/m9.figshare.8084987. 
Extended Data Fig. 1 | Capture hardware used for the results shown in this Letter. 
Extended Data Fig. 2 | Data comparison. a, Raw data for one of the laser 
positions xp. Shown is the number of photons per second accumulated in 
each time bin (that is, the collected histogram divided by the integration 
time in seconds). Time bins are 4 ps wide. As expected, all three curves 
appear to follow the same mean, but there is a larger variance for lower 
exposure times. The raw data thus become noisier as exposure time 
decreases. The effects on the reconstruction are minor, as Extended Data 
Fig. 5 shows. Tacq, acquisition time. b, Example dataset from ref. ° for 
comparison. 


[Extended Data Fig. 3 plot panels; axes: Laser Pos index, Photon Count, Time Bin (4 ps).] 

Extended Data Fig. 3 | Visualization of the raw data for our long- 
exposure office scene. a, Base-10 logarithm of the photon counts in all 
time bins. Pos index, laser position index; the 24,000 laser positions on the 
wall are labelled with these consecutive numbers. b-d, After removal of 
the first 833 time bins in each dataset, the plots show: the photon counts 
for the laser position that received the largest total number of photons in 
the dataset (b); the counts for the laser position that received the median 
number of photon counts (c); and the counts for the laser position that 
contains the time bin with the global maximum count in the entire set (d). 
Extended Data Fig. 4 | Robustness to multiple reflections. Result for 
the synthetic bookshelf scene. a, Without interreflections. b, Including 
high-order interreflections. The quality of the results is very similar. 
c, Primary data (streak images) from the same scene without (top), and 
with interreflections (middle). The synthetic data clearly show how the 
presence of interreflections adds, as expected, low-frequency information 
resembling echoes of light. The bottom image shows primary data 
captured from the real office scene in Fig. 2. It follows the same behaviour 
as the middle image, revealing the presence of strong interreflections. 
Colours refer to numerical values from Matlab’s ‘fire’ colormap, in arbitrary 
units. 
[Panel labels: 50 ms, 100 ms, 1000 ms] 
Extended Data Fig. 5 | Robustness to ambient light and noise. a, Hidden 
bookshelf. b, Imaging results with increasingly higher exposure times; 
even at 50 ms, there is no significant loss in quality. Top row, image using 
only the pulsed laser as illumination source. Bottom row, on adding a large 
amount of ambient light (same conditions as the photograph in a), 
the quality remains constant. c, Difference between the 50-ms- and 
1,000-ms-exposure captures for the lights-off case. 
Extended Data Fig. 6 | Short-exposure reconstructions. Reconstruction 
of the office scene using very short capture times. a, Photograph of the 
captured scene. b, From left to right, reconstructions for data captured 
with 10 ms, 5 ms and 1 ms exposure time per laser. The total capture time 
was about 4 min, 2 min and 24 s, respectively. 
Extended Data Fig. 7 | Short-exposure data. Photon counts in the raw 
data for our office scene for 10 ms (top row), 5 ms (centre row) and 1 ms 
(bottom row) exposure times per laser position. After removing the first 
833 time bins in each dataset, the columns show: the photon counts for 
the laser position that received the largest total number of photons in the 
dataset (left); the counts for the laser position that received the median 
number of photon counts (centre); and the laser position that contains the 
time bin with the global maximum count in the entire set (right). 
Extended Data Fig. 8 | Comparison to prior methods. Reconstruction of the office scene using very short capture times of 1 ms per laser (24 s in total). 
a, Filtered backprojection using the Laplacian filter. b, LOG-filtered backprojection. c, Our method. 
Extended Data Fig. 9 | Robustness to scene reflectance. a, Geometry 
of our experimental set-up. b, From left to right, imaging results for the 
Lambertian targets (roughness 1) and increasingly specular surfaces 
(roughness 0.4 and roughness 0.2). The reconstructed irradiance is 
essentially the same for all cases. 
Extended Data Fig. 10 | Reconstruction comparison on a public 
dataset. From left to right: confocal NLOS deconvolution, filtered (LOG) 
backprojection (FBP) and our proposed method. A large improvement 
in reconstruction quality for the simple scenes included in the dataset 
(isolated objects with no interreflections) is not to be expected, as existing 
methods already deliver reconstructions approaching their resolution 
limits. Nevertheless, our method achieves improved contrast and cleaner 
contours, owing to better handling of multiply scattered light. 
Extended Data Fig. 11 | Reconstruction comparison (noisy data). 
From left to right: confocal NLOS deconvolution, FBP and our proposed 
method. Top row represents a non-retroreflective object; bottom row 
represents a retroreflective object captured in sunlight. In the presence of 
noisy data, FBP fails. Confocal NLOS includes a Wiener filter that needs to 
be explicitly estimated. Our phasor-field virtual wave method yields better 
results automatically. This is particularly important in complex scenes with 
interreflections, where the background is not uniform across the scene, 
and the noise level cannot be reliably estimated. 
Extended Data Table 1 | Photon statistics for captured data used in the paper and in Supplementary Information 

Scene              | Total Photons | Photons/bin | Max. bin | Avg./laser 
Large depth scene  | 3215722952    | 9.7         | 552      | 13742 
NLOS letters       | 6502986696    | 19.6        | 2889     | 27791 
Shelf              | 6158590767    | 18.6        | 2074     | 26319 
Office Scene       | 6201680972    | 18.7        | 1406     | 26503 
Office Scene 10 ms | 48017499      | 0.14        | 18       | 2716 
Office Scene 5 ms  | 24012257      | 0.072       | 15       | 1026 
Office Scene 1 ms  | 4801568       | 0.014       | 6        | 205 

The first four scenes were captured with 1 s exposure time. The first column shows the total photons counted, 
the second shows the average photon count per time bin, the third is the maximum count 
over all time bins, and the last contains the average number of photons collected in each laser position in the dataset. 
Superconductivity in an infinite-layer nickelate 

Danfeng Li, Kyuho Lee, Bai Yang Wang, Motoki Osada, Samuel Crossley, Hye Ryoung Lee, Yi Cui, 
Yasuyuki Hikita & Harold Y. Hwang 


The discovery of unconventional superconductivity in (La,Ba)2CuO4 
(ref. 1) has motivated the study of compounds with similar crystal 
and electronic structure, with the aim of finding additional 
superconductors and understanding the origins of copper oxide 
superconductivity. Isostructural examples include bulk 
superconducting Sr2RuO4 (ref. 2) and surface-electron-doped 
Sr2IrO4, which exhibits spectroscopic signatures consistent with a 
superconducting gap3,4, although a zero-resistance state has not yet 
been observed. This approach has also led to the theoretical 
investigation of nickelates5,6, as well as thin-film heterostructures 
designed to host superconductivity. One such structure is 
the LaAlO3/LaNiO3 superlattice7–9, which has been recently 
proposed for the creation of an artificially layered nickelate 
heterostructure with a singly occupied dx²−y² band. The absence of 
superconductivity observed in previous related experiments has 
been attributed, at least in part, to incomplete polarization of the e_g 
orbitals10. Here we report the observation of superconductivity in 
an infinite-layer nickelate that is isostructural to infinite-layer 
copper oxides11–13. Using soft-chemistry topotactic reduction14–20, 
NdNiO2 and Nd0.8Sr0.2NiO2 single-crystal thin films are synthesized 
by reducing the perovskite precursor phase. Whereas NdNiO2 
exhibits a resistive upturn at low temperature, measurements of the 
resistivity, critical current density and magnetic-field response of 
Nd0.8Sr0.2NiO2 indicate a superconducting transition temperature 
of about 9 to 15 kelvin. Because this compound is a member of 
a series of reduced layered nickelate crystal structures21–23, these 
results suggest the possibility of a family of nickelate 
superconductors analogous to copper oxides24 and pnictides25. 

The most stable nickelates have a formal valence of Ni2+ and a d8 
electronic configuration, such as in NiO and La2NiO4, but they can also 
form with d7 Ni3+, as in LaNiO3. Mimicking the d9 configuration of 
undoped copper oxides requires the highly unusual valence Ni+. 
Although this oxidation state cannot be reached by conventional 
high-temperature synthesis, it was found that low-temperature 
reduction of LaNiO3 can induce a topochemical reaction to form LaNiO2!*». 
(In general, slight oxygen off-stoichiometry is possible, but for 
simplicity we use the stoichiometric formula throughout this manuscript.) 
Subsequently, it was shown that this oxygen deintercalation also occurs 
in epitaxial thin films, with the useful feature that the substrate can 
provide a template that preserves single-crystal c-axis-oriented LaNiO2 
(Fig. 1) in the vicinity of the substrate’”~’*. In this structure, nickel has 
square planar oxygen coordination in two-dimensional NiO2 planes 
(alternating with planes of La), with a predicted d9 configuration 
leaving one hole in the dx²−y² orbital and therefore a possible distinct orbital 
polarization. Indeed such large preferential orbital occupancy near the 
Fermi level has been observed in the related trilayer reduced nickelate 
La4Ni3O8 (nominally Ni1.33+, d8.67)3, 

In preliminary work, we first synthesized LaNiO3 thin films on 
single-crystal SrTiO3 (001) substrates by pulsed-laser deposition, followed 
by a reduction step using CaH2 powder as a reagent (see Methods). 
Whereas LaNiO3 was metallic down to low temperatures, LaNiO2 was 
weakly insulating, consistent with previous reports!”!*. Given that 
perovskite nickelates can be doped by chemical substitution on the 
rare-earth site*°?’, we then explored reduced La1−xSrxNiO2 thin films 
as an approach to hole-dope the parent compound. Although the 
conductivity was enhanced (maximally for x = 0.2; data not shown), in all 
cases the resistivity exhibited insulating temperature dependence below 
about 150 K. Although this result should not be considered definitive 
(it may depend on further optimization of the growth conditions and 
reduction process), we then turned to NdNiO2 in an attempt to increase 
the electronic bandwidth via the smaller ionic radius of Nd with respect 
to La, which results in a smaller cell volume!*"*. This tendency has been 
observed in trilayer reduced nickelates, where La4Ni3O8 is insulating, 
whereas Pr4Ni3O8 is metallic down to low temperature”*. With these 
motivations, we focused our efforts on optimizing and investigating 
NdNiO2 and Nd0.8Sr0.2NiO2, which we present in detail here. 

Bulk NdNiO3 is orthorhombic with room-temperature lattice parameters
a = 5.39 Å, b = 5.38 Å and c = 7.61 Å (a pseudocubic lattice parameter
of about 3.81 Å), and doping with Sr has no substantial influence on
its room-temperature structure and lattice constants. NdNiO2 reduced
from NdNiO3 has been previously synthesized in both polycrystalline
and thin-film form, and was reasonably straightforward to grow.
By contrast, we found that the synthesis of thin-film Nd0.8Sr0.2NiO3
is more challenging, presumably because of the high Ni oxidation
state and reduced tolerance factor compared to LaNiO3 (ref. 78).
Figure 2a shows an X-ray diffraction (XRD) θ-2θ symmetric scan of
a Nd0.8Sr0.2NiO3 film grown under conditions optimized using XRD,
revealing only clear (001) and (002) perovskite peaks (see Methods; 2θ,
diffraction angle). (Throughout much of this work we used a SrTiO3
epitaxial capping layer to protect the reduced nickelate films from
potential degradation, unless otherwise noted.) From the peak positions,
the c-axis lattice constant was extracted to be 3.77 Å, in line with a
film under epitaxial tensile strain imposed by the SrTiO3 substrate.
Figure 2a also shows the θ-2θ diffraction pattern of the film after
reduction, showing peaks with 2θ values of 26.3° and 54.3° corresponding
to the (001) and (002) peaks of the infinite-layer phase, respectively,
confirming the transformation to Nd0.8Sr0.2NiO2. In measurements
with diffraction angles up to 2θ = 114° (not shown) the (003)
film peak is not visible owing to its low intensity, whereas the (004) peak
falls beyond the diffractometer limit, and no other peaks are observed.
Both before and after reduction, the film was always clamped to the
in-plane SrTiO3 lattice (Fig. 2b, c).
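
As a rough cross-check of the peak assignments above, Bragg's law can be applied to the quoted 2θ positions. The sketch below is illustrative only: it assumes the standard Cu Kα1 wavelength of 1.5406 Å (the Methods name a Cu Kα1 source but do not quote the wavelength used), it assumes the usual pseudocubic relation V = 4·a_pc³ for the orthorhombic perovskite cell, and the function names are hypothetical.

```python
import math

# Assumed value: standard Cu K-alpha1 wavelength in angstroms.
CU_K_ALPHA1 = 1.5406


def c_axis_from_peak(two_theta_deg, order, wavelength=CU_K_ALPHA1):
    """c-axis lattice constant (angstrom) from the 2-theta position of a (00l) peak."""
    theta = math.radians(two_theta_deg / 2.0)
    d_spacing = wavelength / (2.0 * math.sin(theta))  # Bragg: lambda = 2 d sin(theta)
    return order * d_spacing


def two_theta_of_peak(c_axis, order, wavelength=CU_K_ALPHA1):
    """2-theta (degrees) of the (00l) peak, or None if Bragg's law has no solution."""
    sin_theta = order * wavelength / (2.0 * c_axis)
    if sin_theta > 1.0:
        return None
    return 2.0 * math.degrees(math.asin(sin_theta))


def pseudocubic_parameter(a, b, c):
    """Pseudocubic lattice parameter, assuming the cell holds four formula units."""
    return (a * b * c / 4.0) ** (1.0 / 3.0)


if __name__ == "__main__":
    print("c from (001) at 26.3 deg:", c_axis_from_peak(26.3, 1))
    print("c from (002) at 54.3 deg:", c_axis_from_peak(54.3, 2))
    print("2theta of (003) for c = 3.37:", two_theta_of_peak(3.37, 3))
    print("2theta of (004) for c = 3.37:", two_theta_of_peak(3.37, 4))
    print("pseudocubic a of bulk NdNiO3:", pseudocubic_parameter(5.39, 5.38, 7.61))
```

Under these assumptions the two infinite-layer peaks give c of about 3.38-3.39 Å, the (004) reflection is predicted near 2θ ≈ 132° (beyond the 114° limit), and the bulk cell yields a pseudocubic parameter of about 3.81 Å.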

In bulk undoped NdNiO3, reducing the perovskite to the infinite-layer
phase (reported for NdNiO2 3) leads to an expansion of the
in-plane lattice constants (about 3.92 Å), along with a shortened c axis
(about 3.31 Å), which is the distance between adjacent Ni-O planes.
From the (001) peak positions of the Nd0.8Sr0.2NiO2 film, the c-axis
lattice constant is found to be 3.37 Å, and it ranges from 3.34 Å to
3.38 Å in samples prepared in nominally similar conditions. The film
experiences compressive strain on the SrTiO3 (3.91 Å) substrate, as well
as potential c-axis expansion due to the partial substitution of Nd by
the larger Sr ion. We note, however, that the metallic nature of
Nd0.8Sr0.2NiO2 counteracts these trends. No signature of a fluorite
defect phase was observed in asymmetric θ-2θ XRD scans of our samples
(both doped and undoped). For thin-film LaNiO3, reduction induces a
series of transformation steps: first to brownmillerite LaNiO2.5, then
to c-axis LaNiO2, followed by a reorientation transition to a-axis
LaNiO2, before subsequent decomposition. For NdNiO3 and Nd0.8Sr0.2NiO3,
we only observe a direct transition to the c-axis infinite-layer
structure (Fig. 1). Our annealing conditions (see Methods) are
empirically optimized to maximize the XRD infinite-layer peak intensity
and minimize the c-axis lattice constant (as a proxy for the removal of
apical oxygen). The comparable (002) peak intensities for the perovskite
and infinite-layer phases (Fig. 2a), as well as the thickness fringes
observed near (002) after reduction, indicate a complete structural
transformation of the film. Reduction for much longer times or at higher
temperature induces decomposition of the film, and no XRD features are
observed.

Fig. 1 | Topotactic reduction of nickelate thin films. Schematic crystal
structures of Nd0.8Sr0.2NiO3 (left) and Nd0.8Sr0.2NiO2 (right) thin films
on the TiO2-terminated single-crystal SrTiO3 (001) substrate. Upon
low-temperature reduction, the films undergo a topotactic transition from
the perovskite phase to the infinite-layer phase.

Figure 3a shows the temperature-dependent resistivity ρ(T) of
NdNiO3 and Nd0.8Sr0.2NiO3. NdNiO3 shows the characteristic first-order
phase transition from a high-temperature paramagnetic metal to a
low-temperature charge-disproportionated antiferromagnetic insulator,
which is suppressed with Sr doping. After reduction (Fig. 3b), we
find that NdNiO2 displays metallic temperature dependence at high
temperatures, with a resistive upturn below about 70 K. By contrast,
Nd0.8Sr0.2NiO2 exhibits metallic behaviour followed by a superconducting
transition, with an onset at 14.9 K (point of maximum curvature),
a midpoint at 13.6 K and zero resistance at 9.1 K (indistinguishable
from the noise floor) for this sample. The temperature-dependent
normal-state Hall coefficient RH(T) is given in Fig. 3c. RH for NdNiO2
is negative at all temperatures, whereas it undergoes a sign change at
about 55 K for Nd0.8Sr0.2NiO2. This feature, as well as the overall
magnitude of RH, is inconsistent with the expectations for simple hole
doping of a single electronic band, and suggests a more complex Fermi
surface. This may be consistent with calculations of the electronic band
structure of LaNiO2, which find multiple electron and hole pockets that
have different orbital contributions and that vary with the Coulomb
interaction. We further note that the interface between the
infinite-layer nickelate and the SrTiO3 substrate (Fig. 1) hosts a strong
polar discontinuity. Depending on how this electrostatic boundary
condition is resolved, there may be transport contributions from interface
states. However, the comparison between NdNiO2 and Nd0.8Sr0.2NiO2
demonstrates that this alone does not lead to superconductivity here.

Fig. 2 | Structural characterization of the doped nickelate thin films.
a, X-ray diffraction θ-2θ symmetric scans of 11-nm-thick Nd0.8Sr0.2NiO3
(top) and Nd0.8Sr0.2NiO2 (bottom; with contribution from gold contacts)
films capped with 20-nm-thick SrTiO3 layers grown on SrTiO3 (001)
substrates. b, c, Reciprocal space maps of Nd0.8Sr0.2NiO3 (b) and
Nd0.8Sr0.2NiO2 (c) around the (103) SrTiO3 diffraction peak. Both maps
indicate that the films are fully strained to the SrTiO3 substrates. a.u.,
arbitrary units.

The observation of superconductivity is quite robust. In Fig. 3d, e
we show a number of different samples of Nd0.8Sr0.2NiO2 synthesized
in nominally similar conditions. The origin of the variation in
transition temperature (Tc) is unclear, but there are some indications
that it correlates with the crystallinity of the parent perovskite phase
and may also reflect slight variations in the oxygen stoichiometry.
In Figs. 3f, 4 we focus on one sample (Fig. 3b) with a high transition
temperature; all other samples showed similar behaviour as scaled
by Tc. Figure 3f shows measurements of the temperature-dependent
current-voltage characteristics for this sample. These characteristics
are linear in the normal state (outside nonlinearities due to Joule
heating at high bias) and increasingly nonlinear below the transition,
and they are characteristic of superconductivity with a critical current
density Jc(2 K) ≈ 170 kA cm⁻².
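
For a sense of scale, the critical current density can be converted into an absolute current through the measured channel. This is a back-of-envelope sketch, not a value reported in the text: it assumes a 10-nm-thick film (the Methods quote 9-11 nm) and the approximately 0.2-mm-wide scribed channel described in Methods, and the function name is hypothetical.

```python
def critical_current(j_c_a_per_cm2, width_cm, thickness_cm):
    """Critical current (A) for a uniform channel: I_c = J_c * width * thickness."""
    return j_c_a_per_cm2 * width_cm * thickness_cm


if __name__ == "__main__":
    j_c = 170e3          # A cm^-2, quoted J_c at 2 K
    width = 0.02         # cm, about 0.2 mm channel (from Methods)
    thickness = 10e-7    # cm, assumed 10 nm film thickness
    print("I_c (mA):", critical_current(j_c, width, thickness) * 1e3)
```

With these assumed dimensions the critical current is of the order of a few milliamperes.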

Figure 4a displays the temperature-dependent magnetoresistance
measured in magnetic fields perpendicular to the plane of the sample,
up to 13 T. The normal state exhibits very little magnetoresistance,
whereas superconductivity is suppressed with increasing field. As a
proxy for the variation of the upper critical field Hc⊥, we take the
midpoint of the resistive transition to the normal state near Tc and fit
it to the linearized Ginzburg-Landau form

$$H_{c\perp}(T) = \frac{\phi_0}{2\pi\,\xi_{\mathrm{GL}}(0)^2}\left(1 - \frac{T}{T_c}\right)$$

where φ0 is the flux quantum and ξGL(0) is the extrapolated
zero-temperature Ginzburg-Landau coherence length, which we find to be
3.25 ± 0.01 nm. (This estimate does not consider potential contributions
from vortex motion or variations due to sample inhomogeneity.)
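
To illustrate the relation above, the sketch below evaluates the linearized Ginzburg-Landau form in SI units, where φ0/(2πξ²) gives μ0Hc⊥ in tesla, and inverts a fitted slope to recover ξGL(0). It is a minimal sketch under stated assumptions: the function names are hypothetical, and the Tc passed in the example is only a placeholder (the quoted resistive midpoint), not the fitted value.

```python
import math

PLANCK_H = 6.62607015e-34            # J s (exact SI value)
ELEMENTARY_CHARGE = 1.602176634e-19  # C (exact SI value)
FLUX_QUANTUM = PLANCK_H / (2.0 * ELEMENTARY_CHARGE)  # Wb, superconducting flux quantum


def upper_critical_field(temperature, t_c, xi_gl0):
    """mu0*Hc_perp(T) in tesla from the linearized Ginzburg-Landau form.

    temperature and t_c are in kelvin, xi_gl0 is in metres.
    """
    return FLUX_QUANTUM / (2.0 * math.pi * xi_gl0 ** 2) * (1.0 - temperature / t_c)


def coherence_length_from_slope(slope_t_per_k, t_c):
    """xi_GL(0) in metres from the fitted slope d(mu0*Hc_perp)/dT near Tc (T/K)."""
    return math.sqrt(FLUX_QUANTUM / (2.0 * math.pi * t_c * abs(slope_t_per_k)))


if __name__ == "__main__":
    xi = 3.25e-9   # m, coherence length quoted in the text
    t_c = 13.6     # K, placeholder: the quoted resistive midpoint
    print("linear extrapolation to T = 0 (T):", upper_critical_field(0.0, t_c, xi))
    slope = -upper_critical_field(0.0, t_c, xi) / t_c
    print("xi recovered from slope (nm):", coherence_length_from_slope(slope, t_c) * 1e9)
```

With ξGL(0) = 3.25 nm the linear extrapolation corresponds to roughly 31 T at zero temperature. The linear form is only expected to hold near Tc, so this number should be read as an extrapolation rather than a measured field.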
We further perform two-coil mutual-inductance measurements in the
perpendicular geometry, as shown in Fig. 4b. Here we plot the real
(Re(Vp)) and imaginary (Im(Vp)) components of the a.c. voltage signal
detected by the pickup coil above the sample. As the sample is cooled
through the transition, Re(Vp) decreases while Im(Vp) exhibits a peak,
indicating an emergent diamagnetic response below the transition as
the magnetic field generated from the drive coil becomes screened by
the superconductor. The fact that Re(Vp) does not approach zero at low
temperatures resembles measurement results of a 40-nm-thick
infinite-layer copper oxide film with Tc ≈ 10.8 K and extrapolated London
penetration depth λL(T = 0) = 2.2 µm (ref. 31). This indicates that λL
for Nd0.8Sr0.2NiO2 is similarly large compared to the film thickness.
Given the numerical uncertainties arising from the finite sample size
(substantially wider films show indications of laterally inhomogeneous
reduction), the order parameter symmetry and the scale of disorder, we
did not attempt to extract λL. Nevertheless, these data suggest
that this is a type-II superconductor with second critical field Hc2
approximately given in the inset to Fig. 4a.

Fig. 3 | Transport properties and superconductivity of the nickelate thin
films. a, Resistivity versus temperature ρ(T) plots of the as-grown NdNiO3
and Nd0.8Sr0.2NiO3 films. b, c, Resistivity (b) and normal-state Hall
coefficient (c) as a function of temperature for the corresponding reduced
films (NdNiO2 and Nd0.8Sr0.2NiO2). d, e, ρ(T) for multiple Nd0.8Sr0.2NiO2
films, showing resistive superconducting transitions. Dotted lines indicate
samples without a capping layer, for which the XRD Scherrer thickness
was used to estimate the resistivity. f, Electric field (E) versus current
density (J) characteristics for varying temperature.

Clearly the analogy to copper oxides motivated this finding, and
much remains to be explored in this new superconducting compound.
However, several important dissimilarities between these two systems
are apparent. One key difference is the energy level alignments in their
orbital electronic structure. Holes in copper oxides are often discussed
in terms of Zhang-Rice singlets with strong oxygen character, owing
to the close spatial overlap and near-energetic degeneracy of the Cu
dx²−y² orbitals and the O 2p orbitals. This naturally leads to large
in-plane antiferromagnetic coupling, which many consider to be central
for superconducting pairing. Because Ni^+ is one column to the
left of Cu^2+ on the periodic table and one oxidation state lower, the
chemical potential in the infinite-layer nickelates is several electronvolts
higher than that of comparable copper oxides; therefore, in hole-doped
nickelates, much less hybridization with the O 2p band is expected.
Furthermore, powder neutron diffraction studies of LaNiO2 and
NdNiO2 show no indication of magnetic order down to 5 K and 1.7 K,
respectively, and the resistivity of NdNiO2 (Fig. 3b) is inconsistent
with a robust insulator (although interface effects may contribute to
conductivity). Consequently, two features that are central to copper
oxides (the Zhang-Rice singlet and large planar spin fluctuations)
may be absent (or considerably diminished) in these nickelate
superconductors.

Fig. 4 | Magnetic-field response of superconducting Nd0.8Sr0.2NiO2.
a, ρ(T) under a varying magnetic field perpendicular to the a-b plane.
The inset shows the variation of the upper critical field Hc⊥ (as estimated
by the midpoint of the resistive transition) with a linear fit in the vicinity
of Tc. b, The real (Re(Vp)) and imaginary (Im(Vp)) parts of the voltage
as a function of temperature in the pickup coil on a Nd0.8Sr0.2NiO2 film,
measured using a two-coil mutual-inductance measurement. μ0, magnetic
constant.

On the materials side, one immediate question is the effect of
the various substrates on the topotactic structural transition of this
system and the associated dependence of superconductivity on epitaxial
strain. Here we have an unusual situation in which the substrate
that stabilizes the phase also strains it. Another important question
is whether there is a doping-dependent superconducting dome, as
found in copper oxides. We believe that our approach to chemical
substitution is broadly applicable and can address this issue, but the
central challenge will be whether complex reduction chemistry can
be homogeneously controlled across a range of unconventional nickel
oxidation states.

METHODS

Film growth. TiO2-terminated SrTiO3 (001) substrates of size 5 × 5 mm² were
pre-annealed at an oxygen partial pressure PO2 = 5 × 10” torr for 30 min at 950 °C
to achieve sharp step-and-terrace surfaces. Perovskite NdNiO3 and
Nd0.8Sr0.2NiO3 films, 9-11 nm thick, were grown on the annealed substrates by
pulsed-laser deposition using a 248-nm KrF excimer laser. This thickness was
chosen because it was approximately equal to the maximum thickness for which we
could verify the formation of a uniform, single-phase infinite-layer film after
reduction using XRD. NdNiO3 (Nd0.8Sr0.2NiO3) films were deposited at a substrate
temperature of Tg = 600 °C and PO2 = 150 mtorr, using a laser fluence of
2 J cm⁻² on the target. Subsequently, SrTiO3 epitaxial capping layers (typically
20 nm thick) were deposited at Tg = 570 °C and the same PO2, using a laser
fluence of 0.8 J cm⁻². After growth, the samples were cooled to room temperature
in the same oxygen environment. The nickelate targets were prepared by sintering
mixtures of stoichiometric amounts of Nd2O3, SrCO3 and NiO powder at 1,350 °C
for 12 h, with two intermediate grinding and pelletizing steps after the initial
decarbonation step at 1,200 °C for 12 h.

Reduction process. After growth, each sample was cut into two pieces of size
2.5 × 5 mm². Each piece (loosely wrapped in aluminium foil) was then
vacuum-sealed together with about 0.1 g of CaH2 powder in a Pyrex glass tube
(pressure <0.1 mtorr). In this way, the pieces were not in direct contact with
the CaH2 powder but underwent a gas-phase reaction with the powder upon
annealing. The tube was heated to 260-280 °C at a rate of 10 °C min⁻¹ and kept
at this temperature for 4-6 h; then it was cooled to room temperature at a rate
of 10 °C min⁻¹.
Characterization. The XRD data were taken using a monochromated Cu Kα1
source. The resistivity, magnetotransport and current-voltage characteristic
measurements were conducted in a six-point geometry using Au and Al
wire-bonded contacts. In some cases, Au contact pads were first deposited using
electron-beam evaporation. Critical-current density-voltage measurements were
performed on a narrow channel defined by a diamond scribe, approximately
0.2 mm wide.

Mutual-inductance measurements. The Nd0.8Sr0.2NiO2 samples were placed
tightly between two collinear coils, the mutual inductance of which was sensitive
to diamagnetic screening of the sample in the superconducting phase. The twin
80-turn coils had inner diameter of about 0.5 mm and outer diameter of around
1.5 mm, yielding a measured self-inductance of about 6 µH. The drive coil was
driven with an alternating current of root-mean-square amplitude of 100 µA and
frequency of 15 kHz. The in-phase and out-of-phase components of the voltage
across the pickup coil (in the microvolt and submicrovolt range, respectively) were
measured by lock-in amplification. The measured voltage was in a regime of linear
response with respect to the amplitude of the drive current.


Data availability
The data presented in the figures and other findings of this study are available from
the corresponding authors upon reasonable request.


Thermal conductance of single-molecule junctions


Longji Cui, Sunghoon Hur, Zico Alaia Akbar, Jan C. Klöckner, Wonho Jeong,
Fabian Pauly, Sung-Yeon Jang, Pramod Reddy & Edgar Meyhofer


Single-molecule junctions have been extensively used to probe
properties as diverse as electrical conduction, light emission,
thermoelectric energy conversion, quantum interference,
heat dissipation and electronic noise at atomic and molecular
scales. However, a key quantity of current interest, the thermal
conductance of single-molecule junctions, has not yet been directly
experimentally determined, owing to the challenge of detecting
minute heat currents at the picowatt level. Here we show that
picowatt-resolution scanning probes previously developed to study
the thermal conductance of single-metal-atom junctions, when
used in conjunction with a time-averaging measurement scheme
to increase the signal-to-noise ratio, also allow quantification
of the much lower thermal conductance of single-molecule
junctions. Our experiments on prototypical Au-alkanedithiol-Au
junctions containing two to ten carbon atoms confirm that thermal
conductance is to a first approximation independent of molecular
length, consistent with detailed ab initio simulations. We anticipate
that our approach will enable systematic exploration of thermal
transport in many other one-dimensional systems, such as short
molecules and polymer chains, for which computational predictions
of thermal conductance have remained experimentally
inaccessible.

Studies of charge and heat transport in molecules are of great
fundamental interest, and are of critical importance for the development
of a variety of technologies, including molecular electronics, thermally
conductive polymers and thermoelectric energy-conversion devices. Given
this overall importance and the daunting experimental challenges, a
number of initial studies explored charge transport in ensembles of
molecules. Although such measurements provided important insights,
researchers gradually began to realize that it was necessary to develop
single-molecule measurement techniques to avoid the confounding effects
of ensemble measurements (including uncertainties in the actual number of
molecules contributing to transport through the junctions and the effects
of intermolecular interactions) and to study systematically the
electrical conduction properties of genuine single-molecule entities.
Corresponding efforts over the past decade have been made to
experimentally characterize heat transport in ensembles of molecules such
as self-assembled monolayers and polymer nanofibres. Not surprisingly,
these thermal ensemble measurements face challenges and uncertainties
similar to those found in previous monolayer electrical measurements,
and intermolecular interactions are expected to have an influence on
the thermal transport properties of molecular junctions. Although
recent experimental advances have enabled heat transport studies
in metallic single-atom junctions (where thermal conductances are in
the region of 500 pW K⁻¹), similar endeavours for single-molecule
junctions (where contributions to heat transport by electrons are
negligible and heat flow is instead dominated by phonons, resulting in
low thermal conductance values of tens of picowatts per kelvin) have
remained unattainable owing to experimental challenges in detecting
such small conductances. This situation is especially frustrating
as computations have predicted several interesting thermal transport
properties in one-dimensional molecular and polymer junctions.

Fig. 1 | Experimental set-up and strategy for quantifying heat transport
in single-molecule junctions. a, Schematic of the calorimetric scanning
thermal microscopy (C-SThM) set-up. Right, a single molecule is trapped
between an Au-coated tip of the C-SThM probe, which features 'T'-shaped
silicon nitride (SiNx) beams and is heated to temperature TP by input of a
heat current (Q) via an embedded serpentine Pt heater-thermometer, and
an Au substrate at temperature TS that is equal to ambient temperature
(Tamb). The thermal conductance of single-molecule junctions is quantified
by recording the temperature change of the Pt heater-thermometer when
a single-molecule junction is broken. Left, resistance network capturing
the thermal resistances of the molecular junction (RSMJ = 1/Gth,SMJ) and
the scanning probe (RP = 1/Gth,P). b, Schematics of the alkanedithiol
molecules (Cn) studied in this work; n = 2, 4, 6, 8, 10 denotes the number
of carbon atoms in the molecules (red, carbon atom; grey, hydrogen atom;
blue, sulphur atom). c, Magnified view of ringed area in a, describing
the trapping of a single C6 molecule between the heated Au STM tip
and the cold Au substrate. d, Scanning electron microscope image (false
coloured to highlight the Pt heater-thermometer) of a custom-fabricated
C-SThM probe (which shows the tip end), featuring two long 'T'-shaped
SiNx beams (see beam cross-section shown ringed in a) and a serpentine
Pt heater-thermometer integrated on a suspended micro-island. The
electrical resistance of the Pt heater-thermometer is monitored by
measuring the voltage output (Vout) in the presence of an input d.c.
current (Iin).

Fig. 2 | Measurement of electrical and thermal conductance of
Au-C6-Au single-molecule junctions. a, Main panel, histogram (shown
in teal) of the electrical conductance of Au-C6-Au junctions obtained
from approximately 500 independent traces of electrical conductance
versus displacement. Inset, representative traces of the electrical
conductance for four independent measurements. A Gaussian fit to the
histogram peak is represented by the solid red line. b, Experimental
protocol for measuring the thermal conductance of a single-C6 junction
(see Methods for details). Upper panel, the electrical conductance trace
indicates rupture of a single-molecule junction by a sudden drop of the
measured Gel value. Lower panel, the coincident thermal conductance
change (ΔGth, left axis) and the related temperature change of the probe
(ΔTP, right axis), where the small effects of Joule heating are already
accounted for (see Methods). It can be seen that, unlike the clearly
identifiable electrical conductance change associated with the breaking
of the junction, the corresponding thermal conductance change is not
discernible in the noisy signal. c, An improved signal-to-noise ratio is
obtained upon aligning via Gel and averaging multiple thermal conductance
traces. Gth,SMJ, indicated by the drop in the thermal conductance signal
after 0.5 s, can be seen after averaging 50 traces and is about
18 pW K⁻¹ for Au-C6-Au single-molecule junctions. The coloured
regions in b and c with their insets indicate the discernible pre- and
post-rupture portions of the recorded and averaged traces.

The first measurements of thermal transport in single-molecule
junctions, which we report here, are enabled by our custom-developed
scanning probe technique, called calorimetric scanning thermal
microscopy (C-SThM), which has excellent mechanical stability and
ultra-high thermal sensitivity. The nanofabricated C-SThM probes
(see Extended Data Fig. 1 for the detailed fabrication process) feature a
suspended micro-island supported by two thin, long 'T'-shaped silicon
nitride (SiNx) beams with both very high stiffness (>10⁴ N m⁻¹ in the
normal direction, see Methods) and very small thermal conductance
(Gth,P ≈ 800 nW K⁻¹; here and elsewhere, subscript P indicates that
a property of the probe is being given). A platinum (Pt) resistor of
serpentine geometry is embedded into the micro-island and serves
as both a heater and a highly sensitive resistance thermometer. When
combined with the time-averaging scheme described below and
in Methods, it reaches a temperature resolution of about 0.1 mK that
enables us to detect heat currents with a resolution of approximately
80 pW, or thermal conductance with a resolution of about 2 pW K⁻¹
root mean square (see Methods).
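
The benefit of a time-averaging scheme of this kind can be illustrated with synthetic data. The sketch below is not the authors' analysis code: it assumes uncorrelated Gaussian noise, traces that are already aligned at the rupture time, an instantaneous thermal response, and a made-up noise level. Only the 18 pW K⁻¹ step is taken from the text; all function names are hypothetical. Under those assumptions, averaging N traces reduces the noise by roughly 1/√N.

```python
import random
import statistics


def synthetic_trace(n_points, break_index, step, noise_rms, rng):
    """One conductance trace: `step` before rupture, zero after, plus Gaussian noise."""
    return [(step if i < break_index else 0.0) + rng.gauss(0.0, noise_rms)
            for i in range(n_points)]


def average_traces(traces):
    """Point-by-point mean of equally long traces aligned at the rupture index."""
    count = len(traces)
    return [sum(column) / count for column in zip(*traces)]


def step_estimate(trace, break_index):
    """Mean signal before rupture minus mean signal after rupture."""
    return statistics.fmean(trace[:break_index]) - statistics.fmean(trace[break_index:])


def residual_noise(trace, break_index):
    """Standard deviation of the post-rupture part of a trace."""
    return statistics.pstdev(trace[break_index:])


if __name__ == "__main__":
    rng = random.Random(0)
    step, noise = 18.0, 100.0  # pW/K; the noise level is a hypothetical choice
    for n_traces in (1, 20, 50, 100, 300):
        traces = [synthetic_trace(1000, 500, step, noise, rng) for _ in range(n_traces)]
        averaged = average_traces(traces)
        print(n_traces,
              round(step_estimate(averaged, 500), 1),
              round(residual_noise(averaged, 500), 1))
```

In a real measurement the rupture index would come from the electrical conductance drop, and the thermal step would be smeared by the thermal time constant of the probe, which this sketch ignores.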

Figure 1a depicts the experimental set-up and the basic strategy
for quantifying thermal conductance at the single-molecule level.
The C-SThM probe, located in an ultra-high-vacuum (UHV) environment,
is heated above ambient to a temperature TP, typically
320-340 K, by supplying a constant electric current (about 30-40 µA)
to the serpentine Pt resistor. The Au substrate, located in the same
UHV environment, is connected to a thermal reservoir maintained at
ambient room temperature TS = 295 K (S indicates the Au substrate).
The planar surface of the Au substrate is coated with a self-assembled
monolayer of prototypical thiol-terminated alkane molecules that are
widely regarded as a model system and have been extensively explored
computationally. We first create molecular junctions by displacing
the Au-coated scanning probe tip at a constant speed via piezoelectric
actuation towards the Au substrate until contact is made between the
two Au electrodes. With a voltage bias applied between the Au tip and
the substrate, contact is signalled by a large, predictable increase in
the recorded electrical conductance (see Methods) and the probe is
then withdrawn slowly from the substrate at a speed of 0.05 nm s⁻¹.
During withdrawal, molecules trapped between the tip of the scanning
probe and the Au substrate break away from either the substrate
or the tip until the last molecular junction is broken. Throughout
the tip withdrawal from the substrate, we continuously monitor both
the electrical current through the junction for a fixed voltage bias and
the temperature of the probe. As detailed below, we use the measured
electrical currents to determine the corresponding electrical conductance
and thereby identify single-molecule trapping events, and infer from
the measured temperature change of the probe (ΔTP) the thermal
conductance of single-molecule junctions (Gth,SMJ).

We first trap molecules of 1,6-hexanedithiol (C6) between the
Au-coated tip of the C-SThM probe and the Au substrate. Figure 2a
shows representative electrical conductance-distance traces
obtained by repeatedly displacing the tip away from the substrate,
and the electrical conductance histogram constructed from about
500 independently measured traces. The histogram features a
pronounced conductance peak at about 5.1 × 10⁻⁴ G0 (electrical
conductance quantum, G0 = 2e²/h ≈ 77.5 µS), indicating the most
probable low-bias conductance of a single-molecule junction. This
value is interpreted as the electrical conductance of a single C6
molecule bridging the Au electrodes, and is in good agreement with
previous work.

Fig. 3 | Length-dependent electrical and thermal transport in
Au-alkanedithiol-Au single-molecule junctions. a, Measured electrical
conductance histograms for different alkanedithiol junctions (C2 to C10;
see key). Red lines represent the Gaussian fit of the histogram peaks.
b, Electrical conductance and thermal conductance traces of
single-alkanedithiol junctions obtained by averaging >100 traces for C2
(155 traces), C4 (133 traces), C8 (110 traces) and C10 (108 traces)
junctions following the experimental protocol described in Fig. 2b.
c, Measured electrical (blue diamonds, right axis) and thermal
conductance (red triangles, left axis) as a function of the molecular length,
as given by the number of CH2 units in the alkanedithiol junctions. The
solid blue line indicates a linear fit to the electrical conductance data on a
logarithmic scale. The measured thermal conductance data are fitted by a
linear curve (green line) on a linear scale, with the region shaded in light
green representing the 95% confidence band. Error bars represent one
standard deviation of the data obtained from three sets of measurements
for each molecule.

To probe the thermal conductance of single-molecule junctions,
we stop the tip withdrawal process when the electrical conductance
of Au-C6-Au junctions is close to (within one standard deviation
around the Gaussian-fitted histogram peak) the most probable low-bias
conductance, and monitor the electrical current and temperature of the
probe until the molecular junction spontaneously breaks. The top panel
in Fig. 2b shows a typical electrical conductance trace measured for an
Au-C6-Au single-molecule junction, showing how the electrical
conductance suddenly drops within a few milliseconds (the time constant
of the electrical measurements) when the molecular junction breaks. As
the Joule heating is small for Au-C6-Au junctions (see Methods) and
breaking removes the thermal conduction pathway through the molecular
junction, we expect a small temperature rise in the probe (ΔTP)
immediately after the junction is broken. This temperature change can
be related to the change in the thermal conductance of the junction
(ΔGth), that is, the thermal conductance of a single-molecule junction
(Gth,SMJ), via Gth,SMJ = −ΔGth ≈ Gth,P ΔTP/(TP − TS) (see Methods),
where Gth,P is the thermal conductance of the probe and TP − TS is
the temperature difference between the probe and the substrate. The
bottom panel of Fig. 2b presents the measured temperature change of
the probe (right y-axis), from which the thermal conductance change,
ΔGth (left y-axis), can be directly determined. (See Methods for a
description of how the measured temperature change of the probe is
processed and how the effects of Joule heating, which are small in this
case but increase for shorter molecules, are systematically accounted for.)
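
The relations used in this section can be checked numerically. The sketch below is illustrative rather than the authors' procedure: it assumes a probe temperature of 330 K (the middle of the quoted 320-340 K range), the Sommerfeld value of the Lorenz number for a Wiedemann-Franz estimate of the electronic contribution, and hypothetical function names.

```python
PLANCK_H = 6.62607015e-34            # J s (exact SI value)
ELEMENTARY_CHARGE = 1.602176634e-19  # C (exact SI value)
LORENZ_NUMBER = 2.44e-8              # W ohm K^-2, Sommerfeld value (assumed applicable)

# Electrical conductance quantum G0 = 2 e^2 / h, in siemens.
G0 = 2.0 * ELEMENTARY_CHARGE ** 2 / PLANCK_H


def junction_thermal_conductance(g_probe, delta_t_probe, t_probe, t_substrate):
    """G_th,SMJ ~ G_th,P * dT_P / (T_P - T_S), in W/K."""
    return g_probe * delta_t_probe / (t_probe - t_substrate)


def probe_temperature_rise(g_junction, g_probe, t_probe, t_substrate):
    """Probe temperature rise (K) expected when a junction of conductance g_junction breaks."""
    return g_junction * (t_probe - t_substrate) / g_probe


def electronic_thermal_conductance(g_electrical, temperature):
    """Wiedemann-Franz estimate of the electronic thermal conductance, in W/K."""
    return LORENZ_NUMBER * temperature * g_electrical


if __name__ == "__main__":
    g_probe = 800e-9               # W/K, quoted probe conductance
    t_probe, t_sub = 330.0, 295.0  # K; t_probe is an assumed mid-range value
    print("G0 (uS):", G0 * 1e6)
    print("heat-current resolution (pW):", g_probe * 1e-4 * 1e12)
    print("conductance resolution (pW/K):",
          junction_thermal_conductance(g_probe, 1e-4, t_probe, t_sub) * 1e12)
    print("dT_P for 18 pW/K (mK):",
          probe_temperature_rise(18e-12, g_probe, t_probe, t_sub) * 1e3)
    print("electronic estimate for C6 (pW/K):",
          electronic_thermal_conductance(5.1e-4 * G0, 0.5 * (t_probe + t_sub)) * 1e12)
```

For a 35 K temperature difference, the 0.1 mK temperature resolution maps to about 80 pW and roughly 2 pW K⁻¹, in line with the quoted figures, and a junction of 18 pW K⁻¹ corresponds to a probe temperature rise of a little under 1 mK. The Wiedemann-Franz estimate for C6 comes out at a few tenths of a picowatt per kelvin, consistent with the statement that the electronic contribution is negligible for this junction.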

Because the thermal conductance of the Au-C6-Au single-molecule
junction is small relative to the noise present in $\Delta T_{\mathrm{P}}$, preventing
reliable direct detection of changes in thermal conductance, we applied
an averaging scheme to improve the signal-to-noise ratio of the thermal
measurements and resolve $\Delta G_{\mathrm{th}}$. In brief, we performed many
measurements (hundreds) following the protocol described above, and
first used the electrical conductance versus time traces to identify the
time point $t_{\mathrm{b}}$ when the single-molecule junction breaks ($t_{\mathrm{b}}$ = 0.5 s
in Fig. 2b). Using the electrical signal as a reference, thermal signals
were then aligned and averaged (see Methods). As the averages over
20, 50, 100 and 300 traces in Fig. 2c illustrate, averaging suppresses
noise and reveals a clear thermal conductance change ($\Delta G_{\mathrm{th}}$) that
coincides with the electrical conductance change caused by the breaking of
the single-molecule junction. This approach reveals a change in the
conductance of about 18 pW K⁻¹, which represents the thermal
conductance of the Au-C6-Au single-molecule junction ($G_{\mathrm{th,SMJ}}$). In
contrast to the rapid transition of the electrical signal on the breaking
of the junction, the roll-off of the thermal conductance is much slower
because it is limited by the thermal time constant of the scanning probe
(about 25 ms).

Fig. 4 | First-principles calculations of the thermal transport through
alkanedithiol single-molecule junctions. a, Calculated thermal
conductance as a function of electrode displacement for an Au-C6-Au
single-molecule junction. Different regions of junction stretching are
distinguished by differently coloured backgrounds, and characterized
by the representative geometries shown as insets: plateau (yellow), decay
(green), pulled-out gold atoms (blue) and broken junction (white).
b, Thermal conductance as a function of electrode displacement for C2,
C4, C8 and C10. c, Mean thermal conductance as calculated from the
green area (AG, green triangles) and the blue area (AB, blue triangles).
Error bars show maximum and minimum thermal conductances in the
respective coloured regions. Estimates (EST, open diamonds) for the
thermal conductance of C2 and C4 are obtained by taking the electronic
contribution into account via the Wiedemann-Franz law and adding it
to the corresponding average of the AB data points. The experimental
data from Fig. 3c are also shown (red triangles and red error bars) to
facilitate comparison between experiment and computations. d, Phonon
transmission as a function of energy for C2, C6 and C10 junctions.
The respective junction geometry, for which the transmission is
plotted, is indicated by an arrow in the corresponding trace in a
or b. In each case the first geometry in the green region was selected.

The ability to resolve the thermal conductance at the single-molecule
scale offers unique opportunities to address fundamental questions
with regard to how thermal transport in single-molecule junctions
depends on molecular characteristics. We illustrate this with additional
thermal transport measurements on a series of alkanedithiol
molecules differing in the number of CH₂ units (from 2 to 10, with
these molecules referred to as C2 to C10, respectively), to explore the
influence of molecular length. Figure 3a shows the measured electrical 
conductance histograms for the studied molecules, with the Gaussian- 
fitted peak values summarized in Fig. 3c. The data document an exponential
decay of the electrical conductance ($G_{\mathrm{el}}$) of single-alkanedithiol
junctions with increasing molecular length (L), indicating
tunnelling-dominated electron transport. We extract a tunnelling decay
constant ($\beta$, where $G_{\mathrm{el}}/G_0 \propto e^{-\beta L}$) of 0.92 ± 0.05 per CH₂ unit, which
agrees well with past work. The measured thermal conductance of the
single-molecule junctions containing C2 to C10 is shown in Fig. 3b,
and the summary of the thermal conductance values is included in
Fig. 3c. We note that, for all molecular junctions, the effect of Joule
heating is systematically accounted for (see Methods). In strong
contrast to the measured length-dependent electrical conductance, the
thermal conductance of the single-alkanedithiol junctions exhibits a
nearly length-independent behaviour with a value of approximately
20 pW K⁻¹, suggesting that thermal transport in single-molecule
junctions is ballistic.
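
To make the contrast between the two length dependences concrete, the short sketch below evaluates the exponential tunnelling law with the reported decay constant of 0.92 per CH₂ unit. The function name is hypothetical, and the calculation only uses the central value of the decay constant, ignoring its uncertainty.

```python
import math

BETA_PER_CH2 = 0.92  # tunnelling decay constant per CH2 unit, as reported in the text

def conductance_ratio(n_long, n_short, beta=BETA_PER_CH2):
    """Ratio G_el(n_long) / G_el(n_short), assuming G_el is proportional to exp(-beta * n)."""
    return math.exp(-beta * (n_long - n_short))

ratio_c10_to_c2 = conductance_ratio(10, 2)
print(f"G_el(C10) / G_el(C2) = {ratio_c10_to_c2:.2e}")
```

Under this law the electrical conductance falls by more than a factor of a thousand between C2 and C10, whereas the measured thermal conductance stays near 20 pW K⁻¹ over the same range.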

To elucidate the microscopic origin of our observations, we use the
Landauer-Büttiker formalism for coherent transport (see Methods).
Combining non-equilibrium Green’s function techniques with
density functional theory (DFT) in a custom-developed code, we
compute, ab initio and thus without recourse to free parameters, the 
thermal conductance due to phonons for individual junction geom- 
etries matching the various alkane chain lengths and conditions 
used in our measurements. Figure 4a shows the computed thermal 
conductance data for a C6 single-molecule junction as a function of 
electrode displacement, with the conductance-distance trace divided into 
different stages. The first stage, shaded in yellow, corresponds to a 
plateau as the molecule rotates slightly upon stretching with little 
change to the thermal conductance. In the second stage, shaded in 
green, the thermal conductance decreases as the junction is elongated 
due to S-Au bond stretching and reconfigurations in the Au electrodes, 
which give rise to decreased metal-molecule coupling. Before the con- 
tact breaks, a third stage occurs, shaded in blue, where gold atoms are 
further pulled out of the electrodes and short atomic dimer chains 
form. This behaviour is well known in the context of atomic force
studies and typically leads to a small additional reduction of the phonon
thermal conductance. In Fig. 4b we depict corresponding traces for 
C2, C4, C8 and C10 junctions, all of which feature similar regions. 
In all cases, the junction breaks owing to the rupture of an Au-Au bond 
(see also Extended Data Fig. 5). 

The experimental data represent the thermal conductance 
at the point where the contact breaks, so we calculate for C2 to 
C10 the thermal conductance values AG and AB that are the average 
over the stretched junctions in the green- and blue-shaded regions in 
Fig. 4a, b, respectively, and compare these in Fig. 4c against the measured
thermal conductance values. The computed AB and AG values lie
in the range 16-21 pW K⁻¹ and 22-33 pW K⁻¹, respectively, and agree
well with the measured data. Further, we observe that the computed 
phononic contribution to the thermal conductance of the junctions 
is nearly independent of molecular length. For completeness, we also 
estimate the electronic contribution to the thermal conductance using 
the measured electrical conductance (Fig. 3a) and the Wiedemann- 
Franz law (see Methods for more details). We find that the electronic 
contribution is about 5.7 pW K⁻¹ and 1.1 pW K⁻¹ for C2 and C4,
respectively, while it is negligible for all other molecules. We have added 
these values (indicated by open diamonds) to the AB data in Fig. 4c. 
On the whole, the theoretically determined thermal conductance val- 
ues are in good agreement with the experimental data. We note that in 
contrast to previous studies, where thermal transport was calculated
for single-molecule junctions under minimal tension, our analysis 
here includes the effect of stretching and reveals a lower thermal 
conductance when a junction is close to rupture. 
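
The electronic estimate mentioned above follows from the Wiedemann-Franz law, in which the electronic thermal conductance equals the Lorenz number times the temperature times the electrical conductance. The sketch below is a minimal illustration: the Lorenz number is the standard Sommerfeld value, while the electrical conductance of 0.01 G₀ and the temperature of 300 K are assumed inputs, not values taken from this section.

```python
LORENZ_NUMBER = 2.44e-8        # W Ohm K^-2, Sommerfeld value
CONDUCTANCE_QUANTUM = 7.748e-5  # S, G0 = 2e^2/h

def electronic_thermal_conductance(g_el, temperature):
    """Wiedemann-Franz estimate: G_th,el = L0 * T * G_el (SI units)."""
    return LORENZ_NUMBER * temperature * g_el

# ASSUMED inputs for illustration: G_el = 0.01 G0 and T = 300 K.
g_th_el = electronic_thermal_conductance(0.01 * CONDUCTANCE_QUANTUM, 300.0)
print(f"Electronic contribution = {g_th_el * 1e12:.2f} pW/K")
```

An electrical conductance of this order yields a contribution of a few pW K⁻¹, comparable to the value quoted for C2; for longer molecules the exponentially smaller electrical conductance makes the term negligible.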

To understand further how heat is transported through single-molecule 
junctions, we show in Fig. 4d the computed energy-dependent transmission
function $\tau_{\mathrm{ph}}(E)$ for C2, C6 and C10 junctions. The function
quantifies the probability of elastic phonon transmission at a specific
energy from one electrode to the other via the bridging molecule.
Owing to coupling to the continuous modes of the metal electrodes,
$\tau_{\mathrm{ph}}(E)$ shows broad resonances with positions and widths that depend
on the precise contact geometry (see Methods and Extended Data Fig. 6
for further discussions). We notice that the transmission functions in
Fig. 4d are finite only in an energy range from 0 to $E_{\max} \approx 20$ meV,
where $E_{\max}$ represents the highest phonon energy of Au. At room temperature
all the transmission resonances in this energy interval determine
the actual value of the thermal conductance (see Methods section
‘Computational methods’, including equation (1), for further details),
and they mainly arise from centre-of-mass motions of the molecule
between the electrodes or low-energy molecular vibrations. For longer
molecules more transmission resonances arise between 0 and $E_{\max}$,
since more molecular modes overlap with the phonon density of states
of Au. In addition, we find that for all junctions the transmission values
are below 3, which, as we have discussed before, is related to the linear,
one-dimensional structure of the alkane molecules. Last, we note
that anharmonic effects, which result in phonon-phonon scattering,
could reduce heat flow. However, owing to the long wavelengths of
the vibrational modes relevant for thermal transport we expect such
anharmonic effects to be small.

Our experimental results illustrate a nearly length-independent 
thermal conductance in alkane-based single-molecule junctions, which 
is in strong contrast with the corresponding exponential length depend- 
ence of the electrical conductance. In contrast to work on monolayers 
and polymer bundles, our work realizes the long-sought goal of unam- 
biguous identification of thermal conductance at the single-molecule 
level. Our ab initio computational analysis provides strong support for 
our experimental data regarding the length independence of thermal 
conductance, and offers mechanistic insights in terms of molecular 
vibrational properties. The experimental advances presented here will 
enable systematic studies of thermal transport through single-molecule 
junctions and other one-dimensional systems such as polymer chains, 
which are of great current interest but so far have remained experi- 
mentally inaccessible. 
METHODS


Nanofabrication of scanning thermal probes. To fabricate the probes (Extended
Data Fig. 1), we started with a 500-μm-thick double-sided silicon wafer and formed
an 18-μm-deep and 1-μm-wide trench on the silicon wafer via wet oxidation and
deep reactive ion etching (DRIE). A 600-nm-thick silicon nitride (SiNₓ) layer was
deposited on both sides of the wafer via low pressure chemical vapour deposition
(LPCVD) and the back side was patterned to facilitate etching using potassium
hydroxide (KOH) for releasing the probe in the last step. A sensitive thermometer
was defined by patterning a 30-nm-thick and 1-μm-wide platinum (Pt)
serpentine line. The tip was fabricated by first depositing a 100-nm-thick platinum
film. Subsequently, a 30-nm-thick SiNₓ layer was deposited via plasma-enhanced
chemical vapour deposition (PECVD) to protect the front side of the probe during
KOH etching. Two shadow masks were introduced separately to deposit a sputtered
50-nm-thick SiNₓ film on the serpentine-shaped Pt-covered region and a
500-nm-thick gold (Au) layer on the tip.

Characterization of thermal, electrical and mechanical properties. Temperature 
coefficient of resistance (TCR). To measure the TCR of the Pt thermometer, a small 
a.c. current of amplitude $I_f$ = 1 nA at frequency f = 200 Hz was supplied to the
embedded Pt serpentine line on the probe, and the resultant 1f component of
the voltage signal, $V_f$, was measured using a lock-in amplifier (SR830) in a
four-probe configuration. The temperature-dependent electric resistance defined as
$R(T) = V_f/I_f$ was evaluated by varying the temperature of the probe inside a
cryostat (Janis ST-100). A representative plot of the measured resistance of a 
scanning thermal probe as a function of temperature is shown in Extended Data 
Fig. 2a. The TCR can be obtained by using the slope of the best-linear-fit curve 
of the measured data points. At room temperature, the TCR was found to be 
(1.45 ± 0.01) × 10⁻³ K⁻¹.
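
The extraction of the TCR from a linear fit can be sketched as follows. This is an illustrative reconstruction rather than the authors' analysis code: the TCR is taken as the fitted slope divided by the fitted resistance at a reference temperature, and the synthetic data (a 5 kΩ resistor at 295 K) are assumed values used only to exercise the function.

```python
import numpy as np

def tcr_from_linear_fit(temperatures_k, resistances_ohm, t_ref_k):
    """TCR = (dR/dT) / R(T_ref), with dR/dT from a least-squares straight line."""
    slope, intercept = np.polyfit(temperatures_k, resistances_ohm, 1)
    r_ref = slope * t_ref_k + intercept
    return slope / r_ref

# Synthetic, noise-free data (ASSUMED): R = 5 kOhm at 295 K, TCR = 1.45e-3 per K.
temps = np.linspace(280.0, 320.0, 9)
resist = 5000.0 * (1.0 + 1.45e-3 * (temps - 295.0))
alpha = tcr_from_linear_fit(temps, resist, t_ref_k=295.0)
print(f"TCR = {alpha:.3e} per K")
```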

Thermal time constant of the probe. The thermal time constant of the calorimetric
scanning thermal probes was determined by applying a sinusoidal electrical current
with constant amplitude $I_f$ at varying frequency f to the Pt resistor. This current
produces sinusoidal Joule heating of the suspended island, $Q_{2f}$, with an associated
temperature fluctuation at 2f and an amplitude of $\Delta T_{2f}$. The 3f component of the output
voltage across the Pt resistor, $V_{3f}$, was recorded using a lock-in amplifier (SRS 830).
$\Delta T_{2f}$ can subsequently be determined according to the relation $\Delta T_{2f} = 2V_{3f}/(\alpha I_f R)$,
where $\alpha$ is the measured TCR and R is the electrical resistance of the Pt serpentine
line. The measured $\Delta T_{2f}$, which was normalized by the amplitude at the lowest
frequency, is shown as a function of the heating frequency in Extended Data Fig. 2b.
Note that the −3 dB point (thermal cut-off frequency) is ~7 Hz, which can be
used to determine the time constant of the probe from $\tau = (2\pi f_{-3\,\mathrm{dB}})^{-1} \approx 25$ ms.
Thermal conductance of the probe. This was measured by applying a sinusoidal
electrical current with fixed frequency f and varying amplitude $I_f$ to the embedded
Pt serpentine line of the probe, resulting in Joule heating. The magnitude of the
heating power in the serpentine line can be calculated as $Q_{2f} = I_f^2 R/2$, resulting in
a corresponding temperature increase $\Delta T_{2f}$ of the island (distal end of the probe).
The heating frequency 2f was chosen to be 1 Hz to ensure a full thermal response
of the probe. As in the characterization of the thermal time constant, $\Delta T_{2f}$
can be quantified by recording the 3f component of the output voltage across the
Pt line. Extended Data Fig. 2c displays the relationship between the amplitude of
the measured temperature increase $\Delta T_{2f}$ and the input heating power $Q_{2f}$. The
thermal conductance of the probe can then be obtained via $G_{\mathrm{th,P}} = Q_{2f}/\Delta T_{2f}$, which
is estimated to be about 800 nW K⁻¹.
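
The probe characterization above rests on a few simple relations: the heating power I_f²R/2, the temperature amplitude 2V_3f/(αI_fR), the conductance as their ratio, and the time constant 1/(2πf) at the −3 dB frequency. The sketch below strings them together as a consistency check. The current amplitude, resistance and temperature amplitude are assumed numbers, the function names are hypothetical, and the form of the temperature relation including the current amplitude is the standard 3f expression rather than something confirmed by this text.

```python
import math

def heating_power_2f(i_f, resistance):
    """Amplitude of the Joule heating at 2f: Q_2f = I_f^2 * R / 2."""
    return i_f ** 2 * resistance / 2.0

def delta_t_2f(v_3f, alpha, i_f, resistance):
    """Temperature amplitude from the 3f voltage: dT_2f = 2 V_3f / (alpha I_f R)."""
    return 2.0 * v_3f / (alpha * i_f * resistance)

def probe_thermal_conductance(q_2f, dt_2f):
    """G_th,P = Q_2f / dT_2f."""
    return q_2f / dt_2f

def thermal_time_constant(f_cutoff_hz):
    """tau = 1 / (2 pi f_-3dB)."""
    return 1.0 / (2.0 * math.pi * f_cutoff_hz)

# ASSUMED illustrative values: 10 uA amplitude, 5 kOhm line, TCR 1.45e-3 per K.
i_f, r_pt, alpha = 10e-6, 5000.0, 1.45e-3
q = heating_power_2f(i_f, r_pt)             # 250 nW
v_3f = alpha * i_f * r_pt * 0.3125 / 2.0    # 3f voltage for an assumed 0.3125 K amplitude
g_probe = probe_thermal_conductance(q, delta_t_2f(v_3f, alpha, i_f, r_pt))
tau = thermal_time_constant(7.0)
print(f"G_th,P = {g_probe * 1e9:.0f} nW/K, tau = {tau * 1e3:.1f} ms")
```

A cut-off of exactly 7 Hz gives about 23 ms, consistent with the quoted value of roughly 25 ms given that the cut-off frequency is itself approximate.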

Mechanical stiffness of the probe. Simulations with the finite element method 
(FEM) were carried out using COMSOL Multiphysics (Solid Mechanics mod- 
ule, COMSOL) to estimate the stiffness of the thermal probes using the following 
boundary conditions: a force of 50 nN, either in the normal or the transverse 
direction, was applied at the end of the probe tip, while the other end of the SiNₓ
cantilevers was fixed. From the computed resultant displacement field, the stiffness
of the probe was estimated to be ~14,000 N m⁻¹ in the normal direction, and
~275 N m⁻¹ and ~12.5 N m⁻¹ in the transverse directions, respectively (as shown
in Extended Data Fig. 3a–c). In our experiments, the normal stiffness is found to
be sufficiently large to form stable molecular junctions. 

Temperature distribution on the probe. Temperature fields generated on the probe 
due to d.c. Joule heating or d.c. heat input to the tip were simulated using COMSOL 
(Joule Heating and Thermal Expansion module). A 10-μA d.c. electrical current
(Extended Data Fig. 3d) was supplied to the Pt line or a 10-μW d.c. heat current
was input at the tip (Extended Data Fig. 3e), while the ends of the SiNₓ cantilevers
were held at 300 K. We note that in both cases the Pt thermometer embedded in 
the island exhibits a uniform temperature distribution and the temperature drop 
occurs primarily along the beams. 

Experimental set-up and measurement schemes. Ultra-low-noise measure- 
ment environment. All electrical and thermal measurements of single-molecule 
junctions were performed in a UHV (~10⁻⁹ torr) scanning probe instrument
(RHK UHV 750), which is housed in a test chamber of a low-noise facility where 
the mechanical floor vibrations are maintained below the NIST-A standard.
The temperature of the chamber was actively controlled so that it drifted by less than
100 mK around a chosen set point of 295 K.

Molecular sample preparation and cleaning protocol for probes. To facilitate the 
formation of single-molecule junctions during the experiments, self-assembled 
monolayers of alkanedithiol molecules were prepared on an ultra-flat planar 
Au-coated substrate, which was prepared via template-stripping. The Au-coated 
substrate was immersed in 500-μM ethanol solutions of alkanedithiol molecules
(C2, C4, C6, C8, C10, from Sigma Aldrich with purity >95%) to initiate the 
self-assembly process of the molecules on the Au surface. After ~12 h of incuba- 
tion, the samples were thoroughly rinsed in ethanol and dried in a nitrogen-filled 
glove box before being transferred into the UHV measurement environment. 
Furthermore, in order to ensure high cleanliness of the Au-coated scanning 
thermal probes, which is critical for successful thermal transport measurements, 
we followed a protocol reported elsewhere. We note that it is critical to avoid any
direct contact of the probe with the ambient, and multiple cycles of wet and plasma 
cleaning are usually needed to eliminate any detectable contamination on the probe. 
Formation of single-molecule junctions and transport measurement circuitry. All 
the single-molecule junctions were created between the scanning thermal probes 
and molecule-covered Au substrates. During the measurement, the probe was
controllably displaced towards the substrate at a speed of 1 nm s⁻¹, and withdrawn
from the substrate at 0.05 nm s⁻¹ after making contact as indicated by a sufficiently
large electrical conductance (compared to the single-molecule conductance). The 
withdrawal of the scanning probe was stopped once a single-molecule junction was 
formed, as indicated by an approximately constant electrical conductance that was 
within one standard deviation of the single-molecule conductance obtained from the 
conductance histogram. Simultaneous electrical and thermal conductance measure- 
ments were recorded for a constant electrode separation until the particular single- 
molecule junction spontaneously broke. The process of formation and breakdown of 
single-molecule junctions was repeated several hundred times for each type of molecule. 

The electrical conductance was measured by supplying a d.c. voltage bias 
(30 mV, 50 mV, 100 mV, 100 mV and 200 mV for C2-C10 junctions, respectively) 
across the scanning thermal probe and the Au substrate, while monitoring the 
tunnelling current across the junctions via a current amplifier (SR570). We note 
that the filter settings of the current amplifier resulted in an electrical time constant 
of ~2 ms in all our experiments (see Figs. 2b and 3b). Smaller voltage biases were 
chosen for shorter molecular junctions, which feature larger electrical conduct- 
ances, to minimize the effects of Joule heating. 

In order to study the thermal conductance of the junctions, the tempera- 
ture change of the scanning thermal probe was measured before and after the 
breakdown of the single-molecule junctions. For this purpose we monitored 
the change in the electric resistance of the embedded Pt resistance thermometer 
via a half-Wheatstone bridge. The output voltage signal of the bridge circuit in 
the presence of a d.c. electric current was first amplified by an instrumentation 
amplifier (AD524) with a gain of 100 and subsequently measured using a low-noise 
voltage amplifier (SR 560 with a gain of 100). 

Selection criteria for single-molecule traces. To analyse the thermal conductance of 
single-molecule junctions, we identify single-molecule events via off-line analysis 
from our continuous recordings by applying the following criteria: (1) the elec- 
trical conductance drops in a clear last step from a constant, expected electrical 
conductance value (corresponding to the previously established values, see above), 
signalling the presence of a single-molecule junction before breakdown; (2) during 
formation and following breakdown of the junction, the thermal conductance 
from the probe to the monolayer sample is relatively stable (drift <100 pW K⁻¹
in 0.5 s), signifying a thermal measurement that is not compromised by a large 
change of background conduction pathways. The first criterion ensures that 
we are only employing data from single-molecule junctions with well-defined 
electrical conductance corresponding to the most probable value, as identified from 
the analysis of the conductance histogram (Figs. 2a and 3a). The second criterion 
is principally informed by our past work, which suggests that the presence of
organic contamination leads to parasitic conductances at the sub-nanowatt per K to 
nanowatt per K level. The variation in this background conductance as a function 
of time, if large, can limit the resolution of our time-averaging approach. In our 
analysis we found that drift values <100 pW K⁻¹ in a 0.5-s period after rupture of
contact are sufficient to achieve the desired signal-to-noise ratio. Rigorous appli- 
cation of the above criteria avoids artefacts and ensures reliable single-molecule 
thermal conductance measurements. A majority (>90%) of the curves that satisfy 
the first criterion were also found to satisfy the second. In the relatively rare events 
(<10%) where large background drift was observed, we deemed the experiment to 
have failed and excluded the curve from the thermal conductance analysis using 
a completely automated process. 
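
The two selection criteria lend themselves to a simple automated filter, sketched below. This is not the authors' code: the way drift is quantified (slope of a straight-line fit over the 0.5-s window after rupture, multiplied by the window length) is an assumption, as are the function name and the synthetic conductance values used in the check.

```python
import numpy as np

def passes_selection(t, g_el, g_th, t_break, g_peak, g_sigma,
                     window=0.5, max_drift=100e-12):
    """Return True if a trace satisfies both criteria (illustrative sketch)."""
    t = np.asarray(t, dtype=float)
    g_el = np.asarray(g_el, dtype=float)
    g_th = np.asarray(g_th, dtype=float)
    before = (t >= t_break - window) & (t < t_break)
    after = (t > t_break) & (t <= t_break + window)

    # Criterion 1: a plateau within one standard deviation of the histogram
    # peak, followed by a clear drop below that band once the junction breaks.
    plateau_ok = abs(g_el[before].mean() - g_peak) <= g_sigma
    dropped = g_el[after].mean() < g_peak - g_sigma

    # Criterion 2: background thermal drift after rupture below the threshold.
    # ASSUMED metric: fitted slope times the window length.
    slope = np.polyfit(t[after], g_th[after], 1)[0]
    drift_ok = abs(slope) * window < max_drift

    return bool(plateau_ok and dropped and drift_ok)

# Synthetic check with hypothetical conductance values (siemens and W/K).
t = np.arange(0.0, 1.2, 0.001)
g_peak, g_sigma = 2.0e-8, 0.4e-8
g_el = np.where(t < 0.5, g_peak, 0.0)
stable = passes_selection(t, g_el, np.zeros_like(t), 0.5, g_peak, g_sigma)
drifting = passes_selection(t, g_el, 400e-12 * t, 0.5, g_peak, g_sigma)
print(stable, drifting)
```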

Noise reduction due to the time-averaging scheme. The measured temperature 
change of the scanning thermal probe is associated with substantial noise
contributions from electronics (amplifiers), Johnson noise, shot noise and
temperature drift of the measurement environment. As shown in Fig. 2b, the unprocessed
thermal signals are featureless and buried in large noise (~100 pW K⁻¹). To
improve the signal-to-noise ratio of the measurements, a time-averaging scheme 
is applied to the thermal conductance traces, which were acquired through independent
measurements of many molecular junctions (~100). Briefly, the time at
which a single-molecule junction breaks ($t_{\mathrm{b}}$) was first detected by analysing the
time series of electrical conductance traces. Subsequently the $\Delta G_{\mathrm{th}}$ signal
corresponding to the same electrical conductance trace was demarcated into a T = 0.5 s
region (shaded in green in Fig. 2b, bottom panel) before $t_{\mathrm{b}}$, and two intervals
τ′ = 0.1 s (unshaded) and T = 0.5 s (shaded in brown) after $t_{\mathrm{b}}$. We note that the
τ′ = 0.1 s interval (around four times the thermal time constant of the probe)
corresponds to the time required to achieve the full thermal response of the probe
after the junction-breaking event. Further, the $\Delta G_{\mathrm{th}}$ signal in
the subsequent 0.5 s after breaking of the molecular junction is averaged, and this
average is set to zero by suitably offsetting the curve. (This procedure ensures compliance with
the physical expectation that the thermal conductance change after breaking of the
junction is negligibly small.) Finally, the thermal conductance traces from each of
the individual experiments were aligned using $t_{\mathrm{b}}$ as the reference point, and data
from corresponding time points were averaged. Following a procedure described
in detail in our previous work, we can estimate the smallest thermal conductance
change ($\Delta G_{\mathrm{th,min}}$) detectable using our time-averaging scheme to be:

Here $G_{\mathrm{noise}}(f)$ is the power spectral density at frequency f associated with the
temperature noise that the probe is subject to, and 2T is the total time over which
the averaging is performed. For example, for 100 molecular junctions the total 
averaging time 2T equals 110 s. By following the protocol that we have developed 
previously to measure the noise spectrum of a Pt resistance thermometer, we can
estimate the power spectral density. With this information and the above equation, 
we estimate $\Delta G_{\mathrm{th,min}}$ to be ~2 pW K⁻¹. Finally, we note that the electrical conductance
traces show additional stepwise changes before rupture of the last junction (see
Extended Data Fig. 4 top panels) that represent recordings during the withdrawal 
of the tip from multi-molecule junctions. These additional changes do not yield 
identifiable multiple conductance states in electrical conductance histograms (see, 
for example, Fig. 2a). Further, given the low thermal conductance of the studied 
molecular junctions, the thermal traces of multi-molecule junctions are, as expected, 
largely featureless (see Extended Data Fig. 4, bottom panels). An averaging approach 
analogous to the successful analysis of the thermal conductance of single-molecule 
junctions cannot yield the corresponding thermal conductances of these multiple- 
molecule junctions, as these states are not well correlated in time and step-size 
compared to the single-molecule junction states. 
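
The alignment, offsetting and averaging steps described in this section can be sketched as below. The code is an illustration under stated assumptions, not the authors' implementation: it assumes 1 kHz sampling (so 500 samples span 0.5 s and 100 samples span 0.1 s), an instantaneous 20 pW K⁻¹ step with no thermal roll-off, white Gaussian noise of 100 pW K⁻¹, and break times that are already known from the electrical traces.

```python
import numpy as np

def align_and_average(traces, break_indices, n_before, n_settle, n_after):
    """Align thermal traces at the break index, zero the post-break baseline, average."""
    aligned = []
    for trace, i_b in zip(traces, break_indices):
        seg = np.asarray(trace[i_b - n_before: i_b + n_settle + n_after], dtype=float)
        # Offset each trace so the window after the settling interval averages to zero.
        seg = seg - seg[n_before + n_settle:].mean()
        aligned.append(seg)
    return np.mean(aligned, axis=0)

# Synthetic data (all values ASSUMED): 300 traces sampled at 1 kHz.
rng = np.random.default_rng(0)
n_before, n_settle, n_after = 500, 100, 500   # 0.5 s, 0.1 s, 0.5 s
step, noise = 20e-12, 100e-12                 # W/K
traces, breaks = [], []
for _ in range(300):
    i_b = int(rng.integers(600, 1300))
    clean = np.where(np.arange(2000) < i_b, step, 0.0)
    traces.append(clean + rng.normal(0.0, noise, 2000))
    breaks.append(i_b)

averaged = align_and_average(traces, breaks, n_before, n_settle, n_after)
recovered_step = averaged[:n_before].mean()
print(f"Recovered step = {recovered_step * 1e12:.1f} pW/K")
```

Although each synthetic trace has noise five times larger than the step, the averaged trace recovers the step height to within a few pW K⁻¹, which mirrors the behaviour shown for increasing numbers of traces in Fig. 2c.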

Effect of Joule heating on measurements of the thermal conductance. In addition 
to averaging the signals from many individual thermal conductance events as 
discussed above, we need to account for the heat dissipation that results from 
the applied electrical bias. Specifically, when a voltage bias (V) is supplied across 
a junction of resistance R, it results in a total heat dissipation of $V^2/R$. Since the
Seebeck coefficient of alkanedithiol junctions is very small, the heat dissipation
in the electrodes is symmetric to an excellent approximation. Therefore, the
heat dissipated in the probe due to the voltage bias is given by $V^2/2R$. When the
single-molecule junction is broken, the probe not only heats up due to the loss of
a thermal conduction pathway, but there is also a competing effect that attenuates
the temperature drop as the heat dissipation in the probe decreases by $V^2/2R$. In
order to systematically account for this effect, we add $\Delta T_{\mathrm{Joule}} = V^2/(2RG_{\mathrm{th,P}})$ to
the measured data in the range 0 to $t_{\mathrm{b}}$ seconds (that is, for the region before the
junction is broken) to obtain the $\Delta T_{\mathrm{P}}$ plot in Fig. 2b and all other related $\Delta G_{\mathrm{th}}$
plots shown in the manuscript (Figs. 2c, 3b). These corrections are modest for C6
(~20%), C8 (<2%) and C10 (<2%) junctions, but they are sizable for C2 (~60%)
and C4 (~30%) junctions. Here, all the percentages represent how large $\Delta T_{\mathrm{Joule}}$
is with respect to the observed temperature drop $\Delta T_{\mathrm{P}}$ when the junction breaks.
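
The size of the Joule-heating correction can be illustrated with a short calculation of V²/(2RG_th,P). The 100 mV bias matches the value listed for C6 in the Methods and the probe conductance of 800 nW K⁻¹ is the measured value; the junction resistance of 50 MΩ is an assumed, order-of-magnitude input, and the function name is hypothetical.

```python
def joule_temperature_correction(v_bias, r_junction, g_probe):
    """Temperature offset from Joule heating in the probe: V^2 / (2 R G_th,P)."""
    return v_bias ** 2 / (2.0 * r_junction * g_probe)

# 100 mV bias (C6, from Methods); G_th,P = 800 nW/K (from Methods).
# The 50 MOhm junction resistance is an ASSUMED illustrative value.
dt_joule = joule_temperature_correction(v_bias=0.1, r_junction=50e6, g_probe=800e-9)
print(f"dT_Joule = {dt_joule * 1e3:.3f} mK")
```

Because the correction scales inversely with the junction resistance, it grows rapidly for the shorter, more conductive molecules, which is why smaller biases were used for them.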
Effect of intermolecular interactions on the measurements of the thermal conduct- 
ance. Given the fact that all our experiments are performed in a UHV environment, 
the probability of interaction between the single-molecule junction of interest and 
other molecules is much smaller than that in experiments performed in a solution 
environment, where several surrounding molecules can potentially interact with 
the junction. Further, it is to be noted that in our experiments the single-molecule 
junction is expected to be isolated from surrounding molecules as the junction 
is created between elongated Au chains that protrude out from the electrodes. 
Therefore, the probability of interaction with surrounding molecules is expected 
to be very low. 
Influence of near-field radiative heat transfer. Near-field radiative heat transfer has a
negligible impact on our measurements. In particular, our strategy for determining 
the thermal conductance of a single-molecule junction relies on measuring the 
change of the thermal conductance when a single molecule that is bridging the 
calorimeter and the substrate breaks away. In the absence of any drift in the gap 
size, the near-field contribution before and after the junction-breaking event is 
identical. For this reason, near-field thermal radiation makes no contribution to 
the measurement, as we are determining the change in the thermal conductance 
upon the breakdown of the junction. The drift in the gap size of our system of 
<1 Å min⁻¹ translates to <15 pm in a 1-s time interval. Given our past analysis,
the near-field radiative conductance change is expected to be ~2 pW K⁻¹ when the
gap size changes by ~1 nm. Therefore, the change in the near-field contribution
due to gap size drift of ~15 pm is negligibly small (<0.03 pW K⁻¹) when compared
to the thermal conductance of a single-molecule junction (~20 pW K⁻¹).

Computational methods. Thermal conductance within the Landauer-Büttiker
approach. To calculate the thermal conductance of molecular junctions, we
employ the Landauer-Büttiker formalism for coherent transport. This
theory describes phonon transport as phase-coherent and elastic. Since in this
formalism transport is described as a scattering problem of waves, the key quantity
is the probability $\tau_{\mathrm{ph}}(E)$ of a phonon at a given energy E to be transmitted from
one lead to the other, which is computed using the procedures developed in our
past work. The linear response coefficient ($G_{\mathrm{th,SMJ}}$) can be calculated using:

$$G_{\mathrm{th,SMJ}}(T) = \frac{1}{h}\int_{0}^{\infty} \mathrm{d}E\, E\, \tau_{\mathrm{ph}}(E)\, \frac{\partial n(E,T)}{\partial T} \qquad (1)$$

where $n(E,T) = [\exp(E/k_{\mathrm{B}}T) - 1]^{-1}$ is the Bose function. The thermal conductance
is given as an energy integral over the transmission function weighted by energy 
and the temperature derivative of the Bose function, which considers the energetic 
content of the transmitted phonons and the difference in phonon populations in 
the leads, respectively. 
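
As a numerical illustration of this energy integral, the sketch below evaluates the standard phonon Landauer expression, (1/h) times the integral of E τ_ph(E) ∂n/∂T over energy, for a toy transmission function. The transmission used here (exactly one open channel from 0 to 20 meV) is an assumption chosen only to set a scale; it is not the computed ab initio transmission, and the function names are hypothetical.

```python
import numpy as np

H = 6.62607015e-34       # Planck constant, J s
KB = 1.380649e-23        # Boltzmann constant, J/K
MEV = 1.602176634e-22    # joules per meV

def bose_temperature_derivative(energy, temperature):
    """dn/dT for n = 1/(exp(E/kT) - 1), written with x = E/(kT)."""
    x = energy / (KB * temperature)
    return x * np.exp(x) / (temperature * np.expm1(x) ** 2)

def phonon_thermal_conductance(energy, transmission, temperature):
    """Trapezoidal evaluation of (1/h) * integral of E * tau(E) * dn/dT dE."""
    integrand = energy * transmission * bose_temperature_derivative(energy, temperature)
    area = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(energy))
    return float(area / H)

# Toy model (ASSUMED): unit transmission up to the Au phonon cut-off of ~20 meV.
energy = np.linspace(1e-3, 20.0, 4000) * MEV   # start just above 0 to avoid 0/0
tau_toy = np.ones_like(energy)
g_toy = phonon_thermal_conductance(energy, tau_toy, 300.0)
print(f"G_th (toy, tau = 1) = {g_toy * 1e12:.1f} pW/K")
```

In this toy picture one fully open channel across the whole Au phonon band carries roughly 65 pW K⁻¹ at 300 K, so a conductance near 20 pW K⁻¹ would correspond to an energy-averaged transmission of about 0.3.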
DFT modelling. In the calculation of the phononic transmission function, the 
dynamical matrix, which describes the mechanical coupling of individual atoms 
in the molecular junction at the microscopic scale, plays a key role. To obtain the 
dynamical matrix, we calculate the second derivative of the Born-Oppenheimer 
energy landscape by using density functional perturbation theory, as implemented
in the quantum chemistry software package TURBOMOLE, version 7.1.
Total energies are converged up to a precision of 10~° a.u., and geometries are 
optimized until the change of the maximum norm of the Cartesian gradient is 
smaller than 10~° a.u. We use the Perdew-Burke-Ernzerhof exchange-correlation 
functional and the default2 basis set of split-valence-plus-polarization quality
def2-SV(P) in combination with the corresponding Coulomb fitting basis.
Junction geometries and pulling curves. All junction geometries studied in Fig. 4 
are built up from two gold pyramids oriented in the crystallographic (111) direc- 
tion: one end of the straight alkane chain is attached via a sulphur group to the 
tip atom of one of the pyramids, and the other end of the chain is attached via a 
sulphur atom to the other pyramid. The atomically sharp pyramids model probe 
and substrate metal electrodes close to the point of rupture, when they have been 
deformed by mechanical stress and gold atoms have been pulled out of the soft but 
initially flat substrate surface. The positions of the atoms in the central junction 
part, consisting of the molecule and the two metal layers closest to it, are optimized 
by energy minimization, while the Au atoms in the two outermost rows of the Au₂₀
pyramids on each side, that is, those most distant from the molecule, are kept fixed. 
To generate the pulling curves of Fig. 4, the junctions are adiabatically stretched by 
displacing the frozen part of the gold atoms on one side of the molecular junction 
in the direction of the difference vector between the Au tip atoms with a step size 
of d = 0.5 a.u. ≈ 0.26 Å, and optimizing again all the atoms in the central junction
part under the new boundary condition set by the fixed outermost Au layers. 
We note that in these simulations, we mimic the experiments and obtain contact 
geometries that correspond to those manifested in single-molecule junctions at
breakdown.
Additional thermal conductance-distance traces. In Fig. 4 we present thermal con- 
ductance versus distance curves for molecular junctions of the five different mole- 
cules C2-C10. In each case a gold atom is pulled out from the Au electrodes to yield 
a short gold chain in the form of a dimer before the contact breaks. To inspect the 
robustness of the results, we performed further simulations of junction stretching 
processes and show in Extended Data Fig. 5 additional pulling curves for C2, 
C6 and C10. Here the gold atoms at the tips move from a three-fold hollow to a 
two-fold bridge position before the contact breaks, as is visible from the geometries 
displayed in the insets of Extended Data Fig. 5. We note that in this analysis the 
initial geometries differ from those shown in the main text with respect to the 
orientation of the molecule on the pyramidal leads, but the stretching protocol is 
otherwise identical. The computed thermal conductances at rupture are slightly 
higher than those shown in Fig. 4, since the blue region is missing, but they are 
within the uncertainty of the measured thermal conductance values (Fig. 3c). As
these examples show, the formation of gold chains in our simulations depends sen- 
sitively on the starting geometry and the details of the adiabatic stretching process. 
A faithful reproduction of the mechanical deformation of the macroscopically 
large gold electrodes will require a larger number of flexible gold atoms than what 
we can at present use in our computationally demanding ab initio simulations. 
Influence of variations of the contact geometry. As for any mechanically controlled 
break-junction technique, junction geometries in the experiment are not well 
controlled at the atomic scale, and the space of possible configurations is huge. 
This leads to uncertainties with regard to molecular configuration, molecule- 
electrode coupling and electrode orientation. Assuming alkane molecules to be 
fully stretched before contact rupture, it is interesting to explore the variation of 
phonon thermal conductance as the geometry of the contacts is varied. In Extended 
Data Fig. 6 we show the computed changes in the energy-dependent phonon trans- 
mission for Au-C10-Au single-molecule junctions with different contact geome- 
tries. Specifically, we find that peak positions and peak widths in the transmission 
spectrum depend on the precise atomic geometries. 

In Extended Data Table 1, we present the computed thermal conductances 
for the four Au-C10-Au junctions shown in Extended Data Fig. 6, but include 
also the data for corresponding junction types containing C2-C8. We also list the 
resulting standard deviations for each molecule that are found to be in the range of 
3-7 pW K⁻¹ and in close correspondence to the standard deviation of the measured 
thermal conductances (see Fig. 3c). We note that in this analysis we designed the 
different junction types such that stress is minimized, in order to concentrate on 
the effects of metal-molecule binding and electrode orientation. The molecular 
contacts are therefore located inside the yellow-shaded area of Fig. 4, which results 
in a somewhat larger thermal conductance than those obtained in our experiments. 
Electronic contributions to the thermal conductance. In Fig. 4 of the main text we 
have estimated $G_{\mathrm{th,el}}$ with the help of the Wiedemann-Franz law, based on the 
mean experimental single-molecule electrical conductance value. The 
Wiedemann-Franz law reads $G_{\mathrm{th,el}} = L_0 T G_{\mathrm{el}}$, where the Lorenz number 
$L_0 = \pi^2 k_{\mathrm{B}}^2/(3e^2) = 2.44 \times 10^{-8}$ W Ω K⁻², $T$ is the temperature, and $G_{\mathrm{el}}$ the electrical 
conductance. Using the experimental values $G_{\mathrm{el}}$ = 10°* $G_0$ for C2, and 
$G_{\mathrm{el}} = 2 \times 10^{-3} G_0$ for C4, in addition to $T = 300$ K, we obtain the data for $G_{\mathrm{th,el}}$ 
given in the text. 
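As a worked check of the Wiedemann-Franz estimate, the sketch below evaluates the Lorenz number from fundamental constants and the electronic thermal conductance for the C4 value quoted above. The function name is illustrative, and the conductance quantum is assumed to be $G_0 = 2e^2/h$, which the text does not define explicitly.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19  # elementary charge, C
H_PLANCK = 6.62607015e-34   # Planck constant, J s

# Lorenz number L0 = pi^2 k_B^2 / (3 e^2), in W Ω K^-2
LORENZ = math.pi ** 2 * K_B ** 2 / (3 * E_CHARGE ** 2)

# Conductance quantum, assumed here to be G0 = 2 e^2 / h, in siemens
G0 = 2 * E_CHARGE ** 2 / H_PLANCK


def electronic_thermal_conductance(g_el_in_g0, temperature=300.0):
    """Wiedemann-Franz estimate G_th,el = L0 * T * G_el, in W/K."""
    return LORENZ * temperature * g_el_in_g0 * G0


# C4 junction: G_el = 2e-3 G0 at T = 300 K (values quoted in the text)
g_c4_pw = electronic_thermal_conductance(2e-3) * 1e12
print(f"L0 = {LORENZ:.3e} W Ω K^-2, G_th,el(C4) = {g_c4_pw:.2f} pW/K")
```

For C4 this gives roughly 1 pW K⁻¹, small compared with the computed phonon thermal conductances of a few tens of pW K⁻¹ listed in Extended Data Table 1.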

Transmission eigenchannels. To obtain further information about heat transport in 
nanosystems, we decomposed the phonon transmission function, $T_{\mathrm{ph}}(E)$, into 
energy-dependent contributions $T_{\mathrm{ph},i}(E)$ of individual transmission eigenchannels $i$: 

$$T_{\mathrm{ph}}(E) = \sum_i T_{\mathrm{ph},i}(E) \qquad (2)$$

These eigenchannels are scattering states, and the transmission coefficients 
$0 \le T_{\mathrm{ph},i}(E) \le 1$ are the eigenvalues of the transmission probability matrix’. 
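The eigenchannel decomposition can be illustrated with a toy example. The sketch assumes a real 2 × 2 transmission amplitude matrix `t` with made-up entries, so that the transmission probability matrix is $t^{\mathsf{T}}t$; in a real calculation the matrix is complex, larger and energy dependent, and comes from the scattering formalism rather than being chosen by hand.

```python
import math


def eigenchannel_transmissions(t):
    """Eigenvalues of the transmission probability matrix t^T t
    for a real 2x2 transmission amplitude matrix t."""
    (a, b), (c, d) = t
    m11 = a * a + c * c          # (t^T t)_11
    m22 = b * b + d * d          # (t^T t)_22
    m12 = a * b + c * d          # (t^T t)_12 = (t^T t)_21
    trace = m11 + m22
    det = m11 * m22 - m12 * m12
    root = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return (trace + root) / 2.0, (trace - root) / 2.0


# Hypothetical amplitudes at a single energy, for illustration only.
t_matrix = [[0.6, 0.2], [0.1, 0.5]]
tau_1, tau_2 = eigenchannel_transmissions(t_matrix)

# The total transmission Tr(t^T t) equals the sum over eigenchannels.
total = sum(x * x for row in t_matrix for x in row)
print(tau_1, tau_2, total)
```

The two eigenvalues play the role of the channel transmissions for $i = 1, 2$, and their sum reproduces the total transmission, as in equation (2).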

In Extended Data Fig. 7 we display, along with $T_{\mathrm{ph},i}(E)$ for $i = 1, 2, 3$, the most 
transmissive eigenchannel i = 1 of C2, C6 and C10 at selected energies. These 
are the highest energies at which a transmission resonance occurs with a value 
close to 1. Note that we show here a static representation of the eigenchannels in 
terms of the real part at time t = 0, despite the general solution being complex or 
time-dependent. Similar to the discussion in our past work”’, a close relation of the 
molecular vibrations to the transmission eigenchannels often exists. We observe 
that for C2 junctions only the centre-of-mass motions of the molecule contribute 
to phonon transport due to the short molecular length. However, for C6 and C10, 
genuine molecular modes carry heat. This is evident from direction changes of the 
arrows in Extended Data Fig. 7 that indicate the atomic motion, when going from 
one end of the molecule in the junction to the other. 
Extended Data Fig. 1 | Fabrication steps for thermal probes. Step 1, 
'T'-shaped cantilever patterning. Step 2, deposition of Pt for the serpentine 
heater-thermometer, pads and the tip. Step 3, SiNₓ layer deposition for 
front side KOH etching. Step 4, probe cantilever release. Step 5, aligning 
each probe on the first shadow mask using a thin low-temperature 
crystal bond layer. Step 6, SiNₓ sputtering on the serpentine Pt 
heater-thermometer. Step 7, aligning each probe on the second shadow mask. 
Step 8, Au sputtering on the tip region. Step 9, detaching the scanning 
probe from the shadow mask and removing the residual crystal bond by 
'piranha' cleaning. 
Extended Data Fig. 2 | Characterization of thermal and electrical 
properties of the scanning thermal probes. a, Measured electrical 
resistance of the Pt heater-thermometer as a function of temperature. 
b, Measured thermal response of the scanning probe as a function of the 
heating frequency. c, Calibration of the thermal conductance of the probe 
(input heating power provided to the Pt heater-thermometer plotted 
against the temperature rise of the probe). The slope of the dashed fitted 
line corresponds to the thermal conductance of the probe. 
Extended Data Fig. 3 | Evaluation of mechanical properties and spatial 
temperature variation of the scanning thermal probes. a-c, A force of 
50 nN was applied either in the normal or the transverse directions of the 
beams, and the deflection for each case was computed ('Displacement' 
colour key). The stiffness of the probe was calculated to be 14,000 N m⁻¹ 
(a), 275 N m⁻¹ (b) and 12.5 N m⁻¹ (c), respectively, for the normal and 
… at right) of the scanning thermal probe when a 10-A d.c. current was 
applied to the embedded serpentine Pt heater-thermometer (d) and a 
10-.W heat current was input from the tip (e). The spatial temperature 
distribution on the island is very uniform (<5% change across locations), 
supporting the expectation that the distributed Pt heater-thermometer 
accurately measures the temperature of the suspended region. 
Extended Data Fig. 4 | Sample electrical and thermal conductance 
traces for Au-C6-Au molecular junctions. a, b, Two independent 
sample recordings. Top and bottom panels show electrical and thermal 
conductance, respectively. The green- and yellow-shaded regions mark 
portions of the recordings that capture the rupture of a single-molecule 
junction to which the time-averaging scheme is applied, while the 
blue-shaded regions during the earlier portions of the withdrawal cycle 
represent recordings that contain events involving multi-molecule 
junctions. A clear last step can be identified in the electrical conductance 
traces (green-yellow region), indicating the breakdown of a single- 
molecule junction. As can be seen, there are also additional steps before 
the last step in the blue-shaded region. The corresponding thermal 
conductance traces that are shown below each electrical conductance trace 
do not reveal any thermal conductance steps, owing to the low signal-to- 
noise ratio. Insets, schematics of single-molecule junctions before and 
after rupture. 
Extended Data Fig. 5 | Simulated thermal conductance as a function of 
electrode displacement for C2, C6 and C10 single-molecule junctions. 
The computed thermal conductance data are shown as black dots, 
coloured regions have the same meaning as in Fig. 4, and snapshots of 
junction structures are displayed as insets. The initial junction geometries 
before displacement of the electrodes differ from those used to generate 
the corresponding plots in Fig. 4, but the procedure employed for 
stretching the junctions is the same. 
Extended Data Fig. 6 | Influence of the contact geometry on computed 
phonon transmission for Au-C10-Au single-molecule junctions. 
a, Different junction types that are used to evaluate the effect of contact 
geometry on the phonon transmission functions. Each terminal sulphur 
atom in the junction is attached to a single Au tip atom (JT1), to two Au 
tip atoms (JT2, JT3) or to three Au atoms (JT4). In these geometries, 
electrodes are oriented along the (111) crystallographic direction (JT1, 
JT4), the (110) direction (JT2) and the (100) direction (JT3). b, Phonon 
transmission as a function of energy for the different junction geometries 
illustrated in a. 
Extended Data Fig. 7 | Phonon transmission eigenchannels for C2, 
C6 and C10 junctions. a-c, Displacement patterns associated with 
the mode shapes of the most transmissive eigenchannel i = 1 for C2, 
C6 and C10, respectively, evaluated at energies of 13.5 meV, 14 meV 
and 18 meV. d, Phonon transmission associated with each of the three 
eigenchannels i = 1, 2, 3 for C2, C6 and C10 molecular junctions. A peak 
in the transmission of eigenchannel i = 1 is found to occur at energies 
of around 13.5 meV, 14 meV and 18 meV for C2, C6 and C10 junctions, 
respectively, and is indicated by the green bars. The displacement patterns 
of transmission eigenchannel i = 1 in a-c have been evaluated at these 
energies. 
Extended Data Table 1 | Calculated thermal conductance 

Molecule   G_th,JT1   G_th,JT2   G_th,JT3   G_th,JT4   Standard deviation 
           (pW/K)     (pW/K)     (pW/K)     (pW/K)     (pW/K) 
C2         28.0       21.9       37.7       30.4       6.5 
C4         29.7       30.9       41.4       41.0       6.3 
C6         33.9       31.7       36.9       40.3       3.7 
C8         35.5       35.5       42.2       32.0       4.2 
C10        44.0       34.8       45.3       32.7       6.4 

Columns 2-5 show the calculated thermal conductance of Au-alkanedithiol-Au single-molecule 
junctions for four types of junction geometries, respectively JT1, JT2, JT3 and JT4. The entry in 
column 1 shows the alkanedithiol molecule concerned. Junction geometries are either identical 
to those shown in Extended Data Fig. 6 (for C10) or similar to them (C2-C8). Each terminal 
sulphur atom attaches to a single Au tip atom for JT1, to two Au tip atoms for JT2 and JT3, or 
to three Au tip atoms for JT4, while the electrodes are oriented along the (111) crystallographic 
direction for JT1 and JT4, the (110) direction for JT2 or the (100) direction for JT3. The standard 
deviation (column 6) indicates the variability of the thermal conductance as determined for each 
molecule from the four different junction types. 
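The standard deviations in column 6 can be recomputed from the four conductance values in each row. The source does not say which estimator was used; the sketch below assumes the sample standard deviation (n − 1 in the denominator), which appears to reproduce the tabulated values to within rounding.

```python
import statistics

# Thermal conductances (pW/K) for JT1-JT4, copied from Extended Data Table 1.
table = {
    "C2": [28.0, 21.9, 37.7, 30.4],
    "C4": [29.7, 30.9, 41.4, 41.0],
    "C6": [33.9, 31.7, 36.9, 40.3],
    "C8": [35.5, 35.5, 42.2, 32.0],
    "C10": [44.0, 34.8, 45.3, 32.7],
}

# Standard deviations (pW/K) as listed in column 6 of the table.
reported = {"C2": 6.5, "C4": 6.3, "C6": 3.7, "C8": 4.2, "C10": 6.4}

# Sample standard deviation (assumed estimator) for each molecule.
computed = {name: statistics.stdev(values) for name, values in table.items()}

for name in table:
    print(f"{name}: computed {computed[name]:.2f}, reported {reported[name]:.1f}")
```

The recomputed values agree with the tabulated ones to within about 0.1 pW/K; small differences are expected because the table entries are themselves rounded.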
Efficient molecular doping of polymeric 
semiconductors driven by anion exchange 
The efficiency with which polymeric semiconductors can be 
chemically doped—and the charge carrier densities that can thereby 
be achieved—is determined primarily by the electrochemical 
redox potential between the π-conjugated polymer and the dopant 
species”. Thus, matching the electron affinity of one with the 
ionization potential of the other can allow effective doping**. Here 
we describe a different process—which we term ‘anion exchange’— 
that might offer improved doping levels. This process is mediated 
by an ionic liquid solvent and can be pictured as the effective 
instantaneous exchange of a conventional small p-type dopant anion 
with a second anion provided by an ionic liquid. The introduction 
of optimized ionic salt (the ionic liquid solvent) into a conventional 
binary donor-acceptor system can overcome the redox potential 
limitations described by Marcus theory’, and allows an anion- 
exchange efficiency of nearly 100 per cent. As a result, doping levels 
of up to almost one charge per monomer unit can be achieved. This 
demonstration of increased doping levels, increased stability and 
excellent transport properties shows that anion-exchange doping, 
which can use an almost infinite selection of ionic salts, could be a 
powerful tool for the realization of advanced molecular electronics. 

Chemical doping of π-conjugated materials necessarily involves 
redox reactions between the host and dopant?. In this process, an inte- 
ger number of electrons is transferred from the host to the dopant via 
electron transfer in the ground state, which is well described by Marcus 
theory°. Because the driving force in donor-acceptor association is 
dominated primarily by the electrochemical redox potential between 
the π-conjugated material and the dopant, efficient doping occurs only 
when charge transfer is energetically favourable. To achieve higher dop- 
ing efficiency (for example, p-type doping), the electron affinity of the 
acceptor (the dopant) should match or exceed the ionization potential 
of the host material?. 

Various dopants and processes have been used to achieve the effi- 
cient chemical doping of organic semiconductors. Despite the success- 
ful tuning of the electron affinity of various conjugated molecules to 
promote efficient doping, increasing the electron affinity often causes 
chemical instability. This represents a major challenge to extending the 
scope of potential molecular dopants, and there have been attempts 
to use photo-assisted doping to mitigate this problem’. In addition to 
charge-transfer interactions based on redox reactions, Coulomb inter- 
actions such as hole-hole and hole-counterion interactions in the case 
of p-type doping*” are an important aspect of organic-semiconductor 
doping, suggesting the possibility of optimizing molecular doping by 
tuning Coulomb or ionic interactions®”. 

Here we demonstrate a general strategy for overcoming 
charge-transfer limitations by using anion exchange. We focus on the 
chemical doping of the well-studied thiophene-based conjugated pol- 
ymer poly(2,5-bis(3-tetradecylthiophen-2-yl)thieno[3,2-b]thiophene) 
(PBTTT)¹⁰ in conjunction with tetrafluorotetracyanoquinodimethane 
(F4TCNQ)¹¹, as shown in Fig. 1a, introducing an additional anion to 
the host-guest system (see Extended Data Fig. 1 and Supplementary 
Information section 1.1). This combination of substances results in 
spontaneous exchange of the F4TCNQ radical anion and the newly 
introduced anion with near-unity exchange efficiency. The anion- 
exchange doping process used here results in remarkable improve- 
ments in both the doping level and the thermal durability of the doped 
material. 

Our anion-exchange doping process involves a host-guest system 
that also includes a large excess of a salt (consisting of a cation, X⁺, 
and an anion, Y⁻; Fig. 1b). As an example, an ionic liquid in which X is 
1-ethyl-3-methylimidazolium (EMIM) and Y is 
bis(trifluoromethylsulfonyl)imide (TFSI) can be used. Surprisingly, in this scenario, the TFSI 
anions are instantaneously exchanged with the F4TCNQ radical anions 
that form the intermediate ion pair [PBTTT•+ F4TCNQ•−] (Fig. 1b). 
Here, we realized anion-exchange doping by using the ionic liquid 
EMIM-TFSI instead of n-butylacetate as a solvent for F4TCNQ (details 
of this method are provided in Supplementary Information section 
1.2). We confirmed hole doping of the polymer via charge transfer by 
observing bleaching of the neutral absorption of PBTTT at 553 nm 
(Fig. 1c). In contrast to a conventional binary system, a characteristic 
doublet originating from the F4TCNQ radical anion was not seen in the 
absorption spectrum of the doped film. We further verified the absence 
of F4TCNQ radical anions in the PBTTT by Fourier transform infrared 
(FTIR) spectroscopy (Fig. 1d and Extended Data Fig. 2). Specifically, 
the FTIR spectrum of the anion-exchanged film does not show the 
peak assigned to the C≡N stretching mode of the F4TCNQ radical 
anion (2,190 cm⁻¹; ref. 12), indicating highly efficient anion exchange 
(F4TCNQ•− → TFSI⁻). 

We also confirmed the absence of the F4TCNQ radical anion in the 
PBTTT by Raman spectroscopy (Supplementary Information section 
1.3) and electron spin resonance measurements. We estimated the 
Curie spin concentration resulting from the F4TCNQ radical anions 
to be 1.7 × 10¹⁹ cm⁻³ (Extended Data Fig. 3), which is very low compared 
with the actual doping level of roughly 1 × 10²¹ cm⁻³ derived 
from Hall effect measurements. Therefore, we conclude that the lower 
limit of the anion-exchange efficiency for this process (F4TCNQ•− → 
TFSI⁻) is 98%. This near-unity exchange efficiency suggests that the 
present anion exchange is driven by the host-guest hybrid system seeking 
to achieve a minimum free energy value. We note that hole doping 
is never observed when a PBTTT thin film is immersed in a pure ionic 
liquid (Extended Data Fig. 4), showing that the F4TCNQ has a vital 
role by producing the initial hole doping, following which the F4TCNQ 
radical anions are exchanged with Y⁻. Note that because a large excess 
of TFSI⁻ is introduced into the hybrid anion system, entropy gain 
according to $-k_{\mathrm{B}}T\ln\left([\mathrm{TFSI}^{-}]/[\mathrm{F4TCNQ}^{\bullet-}]\right)$ can be expected, and will be discussed 
below (here, $k_{\mathrm{B}}$ is the Boltzmann constant and $T$ is temperature). 

Ion exchange is applicable to various chemical processes, and operates 
not only on the principle of the binding preferences of ion pairs, 
Fig. 1 | Summary of anion-exchange 
doping. a, b, Diagrams showing conventional 
molecular doping (a) and anion-exchange 
doping (b). The inset at the top right shows 
the chemical structures of PBTTT and 
F4TCNQ. In anion-exchange doping, a thin 
film of PBTTT is doped with the initiator 
molecule F4TCNQ (i) via a charge-transfer 
interaction. ii, The F4TCNQ radical anions 
are replaced by the Y⁻ anions (where Y is 
TFSI) of the ionic liquid. iii, This anion 
exchange (F4TCNQ•− → TFSI⁻) results in 
the formation of a solid-state donor-acceptor 
complex ([PBTTT•+ TFSI⁻]). c, d, Optical 
absorbance (c) and FTIR spectra (d) of 
pristine PBTTT (black), F4TCNQ-doped 
PBTTT (orange), and PBTTT doped via 
anion exchange (blue). The centre value of the 
peak marked with a single asterisk is 553 nm. 
The centre values of the doublet peak (double 
asterisk) are 775 nm and 881 nm. 
which in turn depends on ionic interactions, but also on statistics, that 
is, on entropy. Here, working under the assumption that the energetic 
gain via anion exchange is established to some extent by the entropy 
factor, we assessed the role of these ionic interactions by comparing 
combinations of eight cations, X⁺, and six anions, Y⁻, in conjunction 
with systematic variations in the effective ion radius, R_eff, that separates 
the charges. The correlation between the sizes of the molecular 
ions having unique shapes and the associated ionic interactions can be 
determined from spatial maps of the electrostatic potentials based on 
van der Waals surfaces (Fig. 2; see Supplementary Information section 
1.4). The results of density functional theory (DFT) calculations show 
that smaller ions have higher electrostatic potentials on their surfaces. 
We discuss here the roles of ionic interactions on the basis of experimental 
observations, which demonstrate that the anion-exchange efficiency 
(F4TCNQ•− → Y⁻) and doping concentration both correlate 
with the strength of the ionic interaction. Specifically, the highest doping 
concentration and most efficient anion exchange are realized when 
using a salt composed of small cations and large anions, which possess 
high and low electrostatic surface potentials, respectively. 

Initially, we assessed the manner in which the electrostatic potential 
of the ionic liquid anion Y⁻ affects the anion-exchange doping 
by considering ionic liquids with four different anions. These anions 
were tetrafluoroborate (BF₄⁻), hexafluorophosphate (PF₆⁻), 
tris(pentafluoroethyl)trifluorophosphate (FAP⁻) and TFSI⁻. In each case, the 
associated cation was 1-butyl-3-methylimidazolium (BMIM⁺). Anion-exchange 
doping was performed by immersing PBTTT thin films in a 
solution of F4TCNQ dissolved in each ionic liquid for 10 min at 60 °C. 
Figure 3a presents the optical absorption spectra obtained from PBTTT 
thin films doped via anion exchange with the four different anions. 
The F4TCNQ•− doublet at 775 nm and 881 nm is observed only for 
Y⁻ = BF₄⁻ and PF₆⁻, and this result is indicative of inefficient anion 
exchange. Interestingly, although smaller anions (such as BF₄⁻ and 
PF₆⁻) should be more mobile than larger anions (FAP⁻ and TFSI⁻), the 
anion-exchange efficiency shows the opposite trend. This clearly suggests 
that the present anion exchange proceeds only when the exchange 
lowers the free energy of the system, where the gain of Gibbs free energy 
is defined as Δ_ex. The results of DFT calculations also show that anion 
exchange with TFSI⁻ is more energetically favourable than that with 
BF₄⁻ (see Supplementary Information section 1.5). This binding preference 
can be understood by considering both the size and the shape 
of the ions, because delocalized anions such as TFSI⁻ will preferentially 
couple with the delocalized charges on the PBTTT. 

We further assessed the mechanism associated with the anion-exchange 
doping here, which is based on the energetic gain Δ_ex, by 
investigating the effect of the cation. These trials used three different 
cations: Li⁺, EMIM⁺ and methyltributylphosphonium (MtBPho⁺). In 
each case, Y⁻ was fixed as TFSI⁻. The cation does not directly interact 
with the PBTTT, but rather can be considered as a spectator ion in 
the anion-exchange process that nevertheless affects the Δ_ex value by 
forming ion pairs. Assessing the disappearance of the F4TCNQ•− doublet 
shows that only MtBPho⁺ did not permit efficient anion exchange 
(Fig. 3b; see Extended Data Fig. 5 for additional data). One possible 
explanation for the beneficial effects of a small cation is that the initial 
ion pair must be composed of ions with poor affinity for one another 
(for example, a small cation and large anion) so as to produce a large 
Δ_ex. This explanation is in good agreement with the empirical hard and 
soft acid and base (HSAB) theory, which is often invoked to explain 
ion-exchange tendencies’. 

The optimization of X⁺ and Y⁻ is of great importance, not only 
because it affects the anion-exchange efficiency, but also because of 
its effect on the doping level, as verified by conductivity measure- 
ments. During these analyses, we doped PBTTT thin films via anion 
Fig. 2 | Molecular structures and electrostatic potential maps of the 
cations and anions used here. These molecular structures and calculated 
electrostatic potentials are based on van der Waals surfaces of the anions 
and cations. R_eff denotes the effective molecular radius, equal to the radius 
of a sphere with the same volume as the organic ion. Ionic radius values 
were used for Li⁺ and Na⁺. The spatial distributions of electrostatic 


exchange using F4TCNQ together with either Li-TFSI or EMIM-TFSI. 
Surprisingly, the conductivity of the film anion-exchanged with EMIM-TFSI 
was increased by a factor of 1.7 compared with doping solely with 
F4TCNQ, while that of the Li-TFSI specimen was 2.4 times higher 
(Fig. 3c), which suggests that the doping level was improved as a result 
of the anion-exchange phenomenon when using an optimized combination 
of X⁺ and Y⁻. It could be argued that the concentration of 
F4TCNQ (F4TCNQ•−) and TFSI affected the conductivity. We found 
Fig. 3 | Variations in anion exchange and doping concentration with 
the strength of ionic interactions. a, Optical absorption spectra of 
anion-exchange-doped PBTTT thin films using various Y⁻ anions (blue, TFSI⁻; 
green, FAP⁻; orange, PF₆⁻; and red, BF₄⁻). b, Optical absorption spectra 
of anion-exchange-doped PBTTT thin films, using various spectator X⁺ 
cations (red, Li⁺; blue, EMIM⁺; and orange, MtBPho⁺). c, Variations in 
the conductivity of anion-exchange-doped PBTTT thin films. The error 
bars in the conductivity show uncertainty in the thickness of PBTTT thin 
films, and represent one standard deviation. d, Photoelectron yield spectra 
acquired from doped PBTTT thin films. 
potentials were calculated using DFT with the B3LYP functional and 
6-311+G(d) basis set (Spartan’16 software). Detailed molecular structures 
are shown in Extended Data Fig. 1. BOB, bis(oxalato)borate; BPyri, 
1-butylpyridinium; BtMA, butyltrimethylammonium; MPPyrr, 1-methyl- 
1-propylpyrrolidinium; PFSI, bis(pentafluoroethanesulfonyl)imide. 


that the doping efficiency (conductivity) decreases when the supply 
of reactant, here TFSI, is lower (Extended Data Fig. 6). Indeed, at the 
concentrations of F4TCNQ•− and TFSI used here, the entropy gain is 
estimated to be approximately 200 meV, which is comparable in magnitude 
to the ionic interactions. In considering the role of this entropy 
gain in our hybrid anion system, our focus is how the ionic interactions 
(binding preference) can have an effect on the doping efficiency 
(conductivity), and to what extent the doping level can be increased by 
tuning ionic compounds in anion-exchange doping. Specifically, the 
most efficient anion exchange, resulting in the highest conductivity of 
620 S cm⁻¹, is realized when using Li-TFSI; this is one of the highest 
values yet reported for doped PBTTT thin films. 
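To give a feel for the size of the entropy term $k_{\mathrm{B}}T\ln([\mathrm{TFSI}^{-}]/[\mathrm{F4TCNQ}^{\bullet-}])$, the sketch below inverts it for the quoted gain of about 200 meV. Two things are assumptions rather than statements from the text: that the estimate refers to the doping temperature of 60 °C, and that a single concentration ratio characterizes the system. The function names are illustrative.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant, eV/K


def entropy_gain_ev(concentration_ratio, temperature_k):
    """Magnitude of k_B * T * ln(ratio), in eV."""
    return K_B_EV * temperature_k * math.log(concentration_ratio)


def ratio_for_gain(gain_ev, temperature_k):
    """Concentration ratio that yields a given entropy gain."""
    return math.exp(gain_ev / (K_B_EV * temperature_k))


# Assumed temperature: the 60 °C used for the doping step.
T_DOPING = 273.15 + 60.0

ratio = ratio_for_gain(0.200, T_DOPING)
print(f"k_B T = {K_B_EV * T_DOPING * 1e3:.1f} meV, implied ratio = {ratio:.0f}")
```

Under these assumptions a gain of 200 meV corresponds to an anion concentration ratio of the order of 10³, consistent with the "large excess" of TFSI described in the text.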

We also monitored the degree of doping with photoelectron yield 
spectroscopy. Figure 3d plots the photoemission yield, 7, as a func- 
tion of the incident photon energy. From the threshold, we estimate the 
ionization potential, I_p, to be 4.83 eV for the pristine PBTTT thin film, 
which is identical to the energy level at the edge of the highest occupied 
molecular orbital (HOMO) band for this material, and close to the 
literature value (−4.7 eV; ref. 1). Compared with F4TCNQ doping, 
a larger shift in the threshold photon energy is clearly obtained following 
anion-exchange doping with Li-TFSI. In this case, we estimate 
I_p to be 5.38 eV, exceeding the lowest unoccupied molecular orbital 
(LUMO) level of F4TCNQ and suggesting that anion-exchange doping 
energetically stabilized the final state by a factor of Δ_ex (Extended 
Data Fig. 7), which has a value of several hundred millielectronvolts. 
Thus, anion exchange is expected to be a driving force to overcome 
the redox potential limitations in molecular doping. We further 
verified this hypothesis through the successful anion-exchange doping 
of the donor-acceptor copolymer poly[2,5-(2-octyldodecyl)-3,6-diketo- 
pyrrolopyrrole-alt-5,5-(2,5-di(thien-2-yl)thieno[3,2-b]thiophene)] 
(PDPP-2T-TT-OD). We found anion-exchange doping to allow rea- 
sonably high doping levels even in donor-acceptor polymers that typ- 
ically have a deep HOMO level (Extended Data Fig. 8). Another control 
experiment is shown in Extended Data Fig. 8, where a weak acceptor, 
tetracyanoquinodimethane (TCNQ), was used as an initiator instead 
of F4TCNQ. Even though the LUMO level of TCNQ does not exceed 
the I_p value of PBTTT, the introduction of Li-TFSI promotes doping (as 
shown by the bleaching of the neutral absorbance). Therefore, we con- 
clude that anion exchange is indeed a driving force behind the doping 
level. Supplementary Information section 1.6 provides a summary of 
the kinetic mechanism responsible for the increased doping levels. The 
overall results show that introducing a salt into a conventional binary 
molecular doping system can overcome the redox potential limitations 
Fig. 4 | Highly ordered structures and coherent charge transport in 
doped PBTTT. a, Area-normalized X-ray diffraction profiles along the 
out-of-plane direction for PBTTT thin films (black, pristine; orange, 
F4TCNQ-doped; blue, anion-exchange-doped with EMIM-TFSI; and 
red, anion-exchange-doped with Li-TFSI). b, Variations in the 
d-spacing and FWHM values of (100) scattering peaks. The error bars in 
FWHM stem from compound errors that result from propagation of the 
uncertainties in fitting of the diffraction peak, and represent one standard 
deviation. c, Temperature dependence of the normalized Hall mobility 
μ_Hall(T)/μ_Hall(300 K). From the Hall effect measurements in Extended 
Data Fig. 9, we determined the Hall mobilities at 300 K (μ_Hall(300 K)) 


described by Marcus theory, thus allowing further improvement of 
molecular doping levels through optimization of ionic interactions. 

The remarkable enhancement in doping levels achieved by anion-exchange 
doping suggests that the additional anion (such as TFSI⁻) is 
incorporated into the host polymer thin film to achieve charge neutrality. 
We verified this solid-state intercalation by X-ray diffraction (XRD) 
analyses. Figure 4a shows out-of-plane XRD profiles for various PBTTT 
thin-film specimens. Here, the first-order (100) diffractions assigned to 
lamellar spacing (d-spacing) are plotted against the scattering vector, q. 
As in the conductivity measurements (Fig. 3c), the samples consisted 
of pristine PBTTT (black), PBTTT doped with F4TCNQ (orange), and 
PBTTT doped via anion exchange with EMIM-TFSI (light blue) or 
with Li-TFSI (red). We determined the d-spacing and full width at half 
maximum (FWHM) values by Gaussian peak fitting, with the results 
summarized in Fig. 4b. These data show that the conductivity of the 
doped PBTTT thin films increased (Fig. 3c) along with the d-spacing. 
The increases in the lamellar spacing suggest that the counteranions 
(F4TCNQ•− or TFSI⁻) were incorporated into the zones occupied 
by alkyl side chains, which is consistent with previous reports’?!>®. 
It is also evident that the FWHM values decrease as the doping level 
increases. Thus, it appears that the extent of lattice disorder decreases 
dramatically. We note here that the relaxation of torsion, tension and/ 
or bending along the polymer backbones may result in crystal rear- 
rangement (Supplementary Information section 1.3). 
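The two quantities extracted from the diffraction peaks follow from standard relations, sketched below: the lamellar spacing from the peak position, $d = 2\pi/q$, and the FWHM of a Gaussian from its standard deviation. The peak positions used here are hypothetical placeholders, not values from the paper, and the function names are illustrative.

```python
import math


def d_spacing(q_peak):
    """Lattice spacing from the position of a first-order peak, d = 2*pi/q.
    With q in Å^-1 the result is in Å."""
    return 2.0 * math.pi / q_peak


def gaussian_fwhm(sigma):
    """Full width at half maximum of a Gaussian with standard deviation sigma."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma


# Hypothetical (100) peak positions in Å^-1, for illustration only.
q_pristine = 0.30
q_doped = 0.27

print(f"d (pristine) = {d_spacing(q_pristine):.2f} Å")
print(f"d (doped)    = {d_spacing(q_doped):.2f} Å")
print(f"FWHM for sigma = 0.01 Å^-1: {gaussian_fwhm(0.01):.4f} Å^-1")
```

A shift of the peak to smaller q therefore corresponds to a larger lamellar spacing, which is the signature of counteranion intercalation discussed above.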
to be 1.9, 2.4 and 2.0 cm² V⁻¹ s⁻¹ for F4TCNQ-doped PBTTT thin film 
(orange) and films anion-exchange-doped with EMIM-TFSI (blue) and 
with Li-TFSI (red), respectively. d, Temperature dependence of the Hall 
carrier density, n_Hall. e, Effects of the magnetic field (B) on differential 
sheet conductivity (Δσ = σ(B) − σ(0)) for Li-TFSI-doped film at various 
temperatures, with B applied perpendicular to the substrate plane. 
f, Effects of temperature on the phase-coherent length (λ_φ). The error 
bars in Hall mobility, Hall carrier density and λ_φ were determined from 
uncertainty in the extraction of electromotive force from the fitting, and 
represent one standard deviation. 


We assessed two-dimensional, coherent charge transport in the 
polymer thin film by magnetotransport analyses, in which we meas- 
ured both the longitudinal and the transverse electromotive forces 
while applying an external magnetic field using a standard Hall bar 
geometry. The Hall voltage is observable only when the charge car- 
riers are equivalent to a free electron—that is, when the wavenumber 
is definable in the charge-transport system!”~'*. Here, we observed a 
clear Hall voltage over a wide range of temperatures from 300 K down 
to 2 K, with the symmetry and sign of the voltage corresponding to 
the hole carrier conduction (Extended Data Fig. 9). Figure 4c, d plots 
Hall mobility, μ_Hall, and Hall carrier density, n_Hall, determined from a 
standard expression of the Hall effect. A remarkably high carrier concentration 
of more than 1 × 10²¹ cm⁻³ (Fig. 4d) was achieved, together 
with a reasonably high Hall mobility of 2 cm² V⁻¹ s⁻¹ (Fig. 4c). This 
concentration is equivalent to one hole per monomer unit (representing 
a half-filled state), and is also approximately three times larger than that 
obtained with conventional F4TCNQ doping¹¹. Surprisingly, the Hall 
mobility was almost completely unaffected by temperature. Although 
a finite fraction of localized carriers hinders a truly metallic signature, 
the temperature-invariant Hall mobility indicates that the obtained 
half-filled state in highly conductive PBTTT thin films is close to the 
onset of metallicity. 
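
The standard Hall expression referred to above is not written out in the text. As a minimal sketch, assuming the usual single-carrier relations n_Hall = IB/(e t V_H) and μ_Hall = σ/(e n_Hall), the conversion could look as follows. The function names, the conductivity and the unit conversions are illustrative assumptions, not values taken from the measurements.

```python
# Sketch only: single-carrier Hall relations (an assumption; the text does not
# spell out the expression used). All numerical inputs are illustrative.
E_CHARGE = 1.602176634e-19  # elementary charge (C)


def hall_carrier_density(current_a, b_field_t, thickness_m, hall_voltage_v):
    """Carrier density n = I*B / (e*t*V_H), returned in m^-3."""
    return current_a * b_field_t / (E_CHARGE * thickness_m * hall_voltage_v)


def hall_mobility(conductivity_s_per_m, carrier_density_m3):
    """Mobility mu = sigma / (e*n), returned in m^2 V^-1 s^-1."""
    return conductivity_s_per_m / (E_CHARGE * carrier_density_m3)


if __name__ == "__main__":
    n_m3 = 1e21 * 1e6   # 1 x 10^21 cm^-3 expressed in m^-3
    sigma = 3.2e4       # hypothetical conductivity in S/m (320 S/cm)
    mu_cm2 = hall_mobility(sigma, n_m3) * 1e4  # convert m^2 to cm^2
    print(f"mobility = {mu_cm2:.2f} cm^2 V^-1 s^-1")
```

Under these assumed relations, a carrier density of 1 × 10²¹ cm⁻³ combined with a mobility near 2 cm² V⁻¹ s⁻¹ corresponds to a conductivity of a few hundred S cm⁻¹.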

Figure 4e summarizes the magnetic-field dependence of the differ- 
ential sheet conductivity of a film (Δσ = σ(B) − σ(0), where σ is the 
electrical conductivity and B is the magnetic flux density) at various 
temperatures. The positive magnetoconductance displayed by the 
doped PBTTT thin film is attributed to weak localization¹¹, as both 
the magnitude and the curvature of the data can be fit using the well- 
established Hikami–Larkin–Nagaoka weak localization model. This 
requires only the characteristic magnetic field at which the matrix ele- 
ment responsible for backscattering loses its phase as the fitting param- 
eter. This fitting allows the phase-coherent length, λ_φ, to be determined 
at various temperatures (Fig. 4f; see also Supplementary Information 
section 1.7). Figure 4f compares the phase-coherent lengths of three 
doped PBTTT thin films (conventional F4TCNQ doping¹¹, orange; 
anion-exchange doping with EMIM-TFSI, light blue, and with Li-TFSI, 
red) and demonstrates that the phase-coherent length becomes longer 
as the doping level increases. 
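
The exact fitting expression is given only in the Supplementary Information. As a hedged illustration, the sketch below uses the commonly quoted single-parameter form of the Hikami–Larkin–Nagaoka model without spin-orbit terms, Δσ(B) = (e²/πh)[ψ(1/2 + B_φ/B) − ln(B_φ/B)], together with the relation B_φ = ħ/(4eλ_φ²). Both the choice of this form and the function names are assumptions; the digamma function ψ is implemented by hand to keep the example self-contained.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge (C)
H_PLANCK = 6.62607015e-34    # Planck constant (J s)
HBAR = H_PLANCK / (2.0 * math.pi)


def digamma(x):
    """Digamma function for x > 0 (upward recurrence, then asymptotic series)."""
    result = 0.0
    while x < 6.0:           # psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def hln_delta_sigma(b_field_t, b_phi_t):
    """Weak-localization sheet magnetoconductance (S) for a field B != 0."""
    x = b_phi_t / abs(b_field_t)
    prefactor = E_CHARGE ** 2 / (math.pi * H_PLANCK)
    return prefactor * (digamma(0.5 + x) - math.log(x))


def phase_coherence_length(b_phi_t):
    """Phase-coherent length (m) for a characteristic field B_phi (T)."""
    return math.sqrt(HBAR / (4.0 * E_CHARGE * b_phi_t))
```

In this form Δσ is positive and grows with B, as expected for weak localization, and a smaller fitted B_φ corresponds to a longer phase-coherent length.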

This approach to doping substantially increases the doping level, 
which in turn promotes two-dimensional, coherent carrier transport, 
and also imparts excellent thermal stability. Because the additional ani- 
ons replace the initial dopant in the film and remain within the polymer 
network, they should have an effect on the physicochemical proper- 
ties of the material. We therefore assessed the thermal durability of 
doped PBTTT thin films through conductivity measurements, based 
on the proportion of the original conductivity retained 
after annealing in an argon-purged glove box for 10 min at 120°C or 
160°C (Extended Data Fig. 10 and Supplementary Information section 
1.8). The conductivity of an F4TCNQ-doped thin film was reduced by 
three orders of magnitude after heating at 160°C, but this change was 
dramatically suppressed in the case of doping with hydrophobic closed- 
shell anions. We note that further improvements to thermal durability 
could therefore be achieved by tuning the physicochemical properties 
of the additional anion. 

Our results show that the molecular doping of polymeric semicon- 
ductors via anion exchange increases both the doping level and the 
thermal durability of the polymer thin film. This process uses ionic 
interactions to build new host-guest structures and increase the dop- 
ing levels, thus overcoming the limitations based on the redox poten- 
tial, and could potentially be extended to electron doping—that is, to 
cation-exchange doping. This technique suggests opportunities for 
the storage, transport and conversion of functional molecules within 
solid-state conjugated materials. The remarkably high doping concen- 
trations and enhanced conductivity demonstrated here should also 
improve our understanding of charge-transport physics, as the half- 
filled state in highly crystalline polymeric semiconductors is likely to 
trigger an electronic phase transition. 
Extended Data Fig. 1 | Molecular structures of the compounds used here. 
Extended Data Fig. 2 | FTIR spectroscopy to assess residual anions 
in PBTTT thin films. a, FTIR spectra of PBTTT-C14 thin films: black, 
pristine; orange, F4TCNQ-doped; light blue, anion-exchange-doped 
with EMIM-TFSI; red, anion-exchange-doped with Li-TFSI; and blue, 
anion-exchange-doped with Li-BOB. b, Magnified FTIR spectra from 
2,050 cm⁻¹ to 2,250 cm⁻¹. C≡N stretching (around 2,190 cm⁻¹) appears 
only in the case of the F4TCNQ-doped film. c, Magnified FTIR spectra 
from 1,650 cm⁻¹ to 1,950 cm⁻¹. The doublet peak assigned to the C=O 
stretching mode (around 1,800 cm⁻¹) in carbonyl groups in BOB⁻ 
is generated only in PBTTT films anion-exchange-doped with Li-BOB. 
This suggests that the BOB⁻ anions were exchanged and incorporated into 
the PBTTT thin film. 
Extended Data Fig. 3 | The origin of Curie susceptibility in anion- 
exchange-doped PBTTT. a, Temperature-dependent electron spin 
resonance (ESR) spectra obtained from PBTTT thin film anion-exchange- 
doped with EMIM-TFSI. The single Lorentzian ESR spectra are observed 
to follow the Curie law. Note that Curie susceptibility is attributed to 
localized spins either on F4TCNQ radical anions or on PBTTT radical 
cations. Hall effect measurements indicate that carriers in the highly 
doped PBTTT are likely to undergo delocalized transport, thus producing 
Pauli paramagnetic susceptibility that is negligible compared with the 
Curie effect. Although the g-factors of the PBTTT radical cation and 
F4TCNQ radical anion are identical, it is reasonable to assume that the 
observed Curie susceptibility originates from localized F4TCNQ radical 
anions. We found the experimentally determined spin concentration to be 
much less than the actual carrier concentration in the PBTTT thin film, 
as discussed in the main text. b, ESR spectrum for anion-exchange-doped 
PBTTT, acquired at 4.3 K with the external magnetic field perpendicular 
to the film plane. The result of single Lorentzian fitting is plotted as a 
black curve. c, The effect of temperature (T) on spin susceptibility, as 
determined by double integration of the ESR spectra. 
Extended Data Fig. 4 | Comparison of absorption spectra with and 
without an initiator dopant. Optical absorption spectra of PBTTT thin 
films after immersion in EMIM-TFSI, with and without F4TCNQ. The 
spectra show that doping occurs only when F4TCNQ is dissolved in the 
ionic liquid. Similarly, a PBTTT thin film immersed in a solution of Li- 
TFSI in n-butyl acetate without F4TCNQ shows no doping. These results 
demonstrate that F4TCNQ is necessary to initiate the doping reaction. 
Extended Data Fig. 5 | Anion-exchange doping with a series of organic 
cations (X⁺). Absorption spectra are shown following anion-exchange 
doping of PBTTT films with four different organic cations (with Y⁻ fixed 
as TFSI⁻). 
Extended Data Fig. 6 | Effects of the concentrations of the initiator 
dopant and additional anion. a, b, Changes in conductivity as a function 
of: initiator dopant (F4TCNQ) concentration (a; red); and additional 
anion (TFSI) concentration following anion-exchange doping (b; red). 
The conductivity of an F4TCNQ-doped PBTTT thin film in n-butyl 
acetate (nBA) is also shown as a reference (a; black). Concentration 
evidently has a limited effect by comparison with the salt species. The 
error bars in the conductivity stem from uncertainty in the thickness of 
PBTTT thin films, and represent one standard deviation. 
Extended Data Fig. 7 | Energy alignment diagram and change in Gibbs 
energy due to anion-exchange doping. a, In a neutral state, the ionization 
potential (I_p) of an organic semiconductor is equal to the edge of its 
semiconductor HOMO band (HOMO_SC). b, In conventional molecular 
doping with F4TCNQ, electrons within the HOMO band of the organic 
semiconductor are transferred to the LUMO level (dotted line) of the 
F4TCNQ, such that I_p is close to the LUMO level of the dopant (LUMO_D). 
The resulting donor-acceptor association minimizes the Gibbs free energy 
at equilibrium (orange line) such that no further charge transfer occurs. 
c, In anion-exchange doping, additional energy gain reduces the Gibbs 
free energy of the final state (red line), thus promoting the charge-transfer 
reaction. In this case, the I_p of the organic semiconductor exceeds the 
LUMO level of the dopant approximately by the energy gain resulting 
from anion exchange (Δ_ex). We determined the resulting shift in I_p by 
photoelectron yield spectroscopy to be approximately 0.2 eV (Fig. 3d). 
Extended Data Fig. 8 | An example of overcoming the limitation of 
redox potential by anion-exchange doping. a, Ultraviolet–visible–near- 
infrared (UV–vis–NIR) spectra of pristine, F4TCNQ-doped, and anion- 
exchange-doped (with Li-TFSI) donor-acceptor copolymer thin films 
based on PDPP-2T-TT-OD, which has a deep HOMO level (−5.5 eV). 
b, Molecular structure of PDPP-2T-TT-OD. c, Energy-level alignment 
diagram for PDPP-2T-TT-OD, F6TCNNQ and F4TCNQ, along with the 
molecular structure of F6TCNNQ. Because of the deep HOMO level of 
PDPP-2T-TT-OD, charge transfer is not expected following conventional 
molecular doping with F4TCNQ. Anion-exchange doping with Li-TFSI 
results in bleaching of the neutral peak and the appearance of PDPP- 
2T-TT-OD polaron peaks, using F4TCNQ as the initiator dopant. The 
doping level obtained from anion-exchange doping with Li-TFSI is high 
compared with that reported for F6TCNNQ doping, as determined from 
the intensity ratio of the neutral (815 nm) and polaron peaks (1,400 nm) 
(the polaron/neutral ratio is about 0.1 for F6TCNNQ molecular doping 
and about 0.5 for anion-exchange doping with Li-TFSI). d, UV–vis–NIR 
spectra of TCNQ-doped and anion-exchange-doped (with BMIM-TFSI 
or Li-TFSI) PBTTT thin films. A bleaching of neutral absorbance was 
observed only with Li-TFSI. e, Energy-level alignment diagram for PBTTT 
and TCNQ, along with the molecular structure of TCNQ. The LUMO 
level of TCNQ is too shallow (−4.5 eV) to produce ground-state charge 
transfer. Even so, introducing Li-TFSI (that is, anion-exchange doping) 
promotes efficient doping. This presumably occurs because a slight overlap 
of tail states between HOMO and LUMO levels could initiate charge 
transfer between PBTTT and TCNQ, and therefore TCNQ•⁻ is exchanged 
for TFSI⁻. Overall, control experiments show that the initiator acceptor 
need not be a powerful acceptor, and that efficient molecular doping is 
driven by anion exchange. 
Extended Data Fig. 9 | […] exchange-doped PBTTT. a, b, Transverse (Hall) voltage values obtained 
from PBTTT thin films that have been anion-exchange-doped with 
Li-TFSI (a) and EMIM-TFSI (b) at various temperatures. c, Effect of 
[…] (Δσ = σ(B) − σ(0)) at various temperatures, with B applied perpendicular 
to the substrate plane. See Supplementary Information for details on fitting 
of the magnetoconductance data. 
Extended Data Fig. 10 | Thermal durability of doped PBTTT thin films. 
These UV–vis–NIR spectra were obtained from doped PBTTT thin films 
before and after annealing at the indicated temperatures. a, An F4TCNQ- 
doped PBTTT thin film. b–i, Anion-exchange-doped films with: 
b, K₂CO₃; c, BMIM-FAP; d, BMIM-FeCl₄; e, BMIM-BF₄; f, BMIM-PF₆; 
g, BMIM-TFSI; h, Li-PFSI; and i, Li-BOB. j, Ratios of conductivity before 
and after annealing at 120°C and 160°C. Details of doping conditions are 
provided in the Supplementary Information. 
Increased shear in the North Atlantic upper-level jet 
stream over the past four decades 


Simon H. Lee¹, Paul D. Williams¹* & Thomas H. A. Frame¹ 


Earth’s equator-to-pole temperature gradient drives westerly 
mid-latitude jet streams through thermal wind balance¹. In the 
upper atmosphere, anthropogenic climate change is strengthening 
this meridional temperature gradient by cooling the polar lower 
stratosphere²,³ and warming the tropical upper troposphere⁴⁻⁶, 
acting to strengthen the upper-level jet stream⁷. In contrast, in 
the lower atmosphere, Arctic amplification of global warming is 
weakening the meridional temperature gradient⁸⁻¹⁰, acting to 
weaken the upper-level jet stream. Therefore, trends in the speed 
of the upper-level jet stream¹¹⁻¹³ represent a closely balanced tug- 
of-war between two competing effects at different altitudes¹⁴. It 
is possible to isolate one of the competing effects by analysing the 
vertical shear—the change in wind speed with height—instead of the 
wind speed, but this approach has not previously been taken. Here 
we show that, although the zonal wind speed in the North Atlantic 
polar jet stream at 250 hectopascals has not changed since the start 
of the observational satellite era in 1979, the vertical shear has 
increased by 15 per cent (with a range of 11-17 per cent) according 
to three different reanalysis datasets¹⁵⁻¹⁷. We further show that this 
trend is attributable to the thermal wind response to the enhanced 
upper-level meridional temperature gradient. Our results indicate 
that climate change may be having a larger impact on the North 
Atlantic jet stream than previously thought. The increased vertical 
shear is consistent with the intensification of shear-driven clear-air 
turbulence expected from climate change¹⁸⁻²⁰, which will affect 
aviation in the busy transatlantic flight corridor by creating a more 
turbulent flying environment for aircraft. We conclude that the 
effects of climate change and variability on the upper-level jet stream 
are being partly obscured by the traditional focus on wind speed 
rather than wind shear. 



In the Northern and Southern hemispheres, the mid-latitude baro- 
clinic zone of the atmosphere is associated with a planetary-scale 
meridional temperature gradient between the equator and the pole. 
This temperature gradient generates westerly winds that strengthen 
with height—vertical wind shear—as a consequence of thermal wind 
balance¹. Using pressure as a vertical coordinate, the vertical shear in 
the zonal wind, −∂u/∂p, is related to the meridional temperature 
gradient, ∂T/∂y, by the thermal wind balance equation: 

$$ -\frac{\partial u}{\partial p} = -\frac{R}{fp}\,\frac{\partial T}{\partial y} \qquad (1) $$

where R is the specific gas constant for dry air, f is the Coriolis parame- 
ter, p is pressure, and y is northward distance. Aloft, the strong westerly 
winds generated by thermal wind balance form the polar (or mid- 
latitude) jet stream, the speed of which is typically maximised near the 
tropopause, where the sign of the meridional temperature gradient (and 
thus the sign of the vertical shear) reverses. The polar jet stream is often 
described as eddy-driven, because eddies are required to support non- 
zero surface westerlies. It is distinct from the subtropical jet stream, 
which is primarily caused by poleward transport of angular momen- 
tum in the Hadley cell²¹. The polar jet stream influences mid-latitude 
weather systems, with the storm tracks being essentially a surface 
expression of the jet stream²². It also has an important role in commer- 
cial aircraft operations, partly because it creates strong headwinds and 
tailwinds on busy mid-latitude flight routes²³, but also because clear-air 
turbulence is generated by the associated intense vertical wind shear. 
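
To make the thermal wind relation concrete, the following minimal sketch evaluates the shear −∂u/∂p = −(R/fp) ∂T/∂y for a given meridional temperature gradient. The gradient of −5 K per 1,000 km, the latitude of 50° N and the function names are illustrative assumptions rather than values from the reanalysis datasets.

```python
import math

R_DRY = 287.0        # specific gas constant for dry air (J kg^-1 K^-1)
OMEGA = 7.2921e-5    # Earth's rotation rate (rad s^-1)


def coriolis_parameter(lat_deg):
    """Coriolis parameter f = 2 * Omega * sin(latitude), in s^-1."""
    return 2.0 * OMEGA * math.sin(math.radians(lat_deg))


def thermal_wind_shear(dT_dy, lat_deg, p_pa):
    """Return -du/dp (m s^-1 Pa^-1) from -du/dp = -(R / (f p)) * dT/dy."""
    return -(R_DRY / (coriolis_parameter(lat_deg) * p_pa)) * dT_dy


if __name__ == "__main__":
    dT_dy = -5.0 / 1.0e6                      # illustrative: -5 K per 1,000 km
    shear = thermal_wind_shear(dT_dy, 50.0, 250.0e2)
    print(f"{shear * 1.0e4:.2f} m/s per 100 hPa")  # 100 hPa = 1e4 Pa
```

A temperature that decreases towards the pole (negative ∂T/∂y) gives positive −∂u/∂p in the Northern Hemisphere, that is, westerly wind increasing with height.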
The mid-latitude meridional temperature gradients are being 
modified by anthropogenic climate change²⁴, and the jet streams 
are expected to adjust in response. In the lower troposphere of 
the Northern Hemisphere, Arctic amplification caused primarily by 
Fig. 1 | Annual-mean temperature trends in the North Atlantic at 
250 hPa over the period 1979-2017. Linear trends are calculated using 
ordinary least-squares regression from the ERA-Interim (a), NCEP/NCAR 
(b) and JRA-55 (c) reanalysis datasets. Significant trends are indicated by 
stippling (two-tailed t-test; P < 0.05; n = 39). 
Fig. 2 | Vertical profiles of trends in the annual-mean north-south 
temperature difference across the North Atlantic over the period 
1979-2017. Linear trends are calculated from the ERA-Interim (a), 
NCEP/NCAR (b) and JRA-55 (c) reanalysis datasets. Red and blue 


lapse-rate feedbacks is weakening the meridional temperature gradient 
and polar jet stream. In contrast, in the upper troposphere and lower 
stratosphere, the meridional temperature gradient is strengthening 
because of the combined effects of polar lower-stratospheric cooling 
and tropical upper-tropospheric warming, the latter caused by water 
vapour feedbacks releasing additional latent heat and reducing the 
lapse rate. The vertically integrated thermal wind response is a tug- 
of-war between these two competing effects, with Arctic amplifica- 
tion acting to decrease the wind speed in the upper troposphere and 
lower stratosphere, but polar lower-stratospheric cooling and tropical 
upper-tropospheric warming acting to increase it. These competing 
influences suggest that upper-level trends in the jet stream may be 
better discerned through changes in vertical wind shear rather than 
absolute wind speed. 

Here we analyse historic trends in the upper-level vertical wind shear 
in the North Atlantic region. In future climate projections, the preva- 
lence of clear-air turbulence at typical aircraft cruising altitudes increases 
more here than anywhere else globally. We use data from the ERA- 
Interim reanalysis at 0.75° horizontal resolution¹⁵, the NCEP/NCAR 
reanalysis at 2.5° horizontal resolution¹⁶, and the JRA-55 reanalysis at 
1.25° horizontal resolution¹⁷. The use of three independently produced 
reanalysis datasets allows us to quantify the sensitivity of our results to 
uncertainties in the state of the atmosphere. We take six-hourly data 
from the years 1979-2017 inclusive. We restrict the temporal coverage 
to the satellite era, because the sparsity of upper-level wind observations 
over the North Atlantic before 1979 substantially increases uncertainty 
in reanalysis datasets²⁷. We consider data within the region defined by 
30-70° N and 10-80° W. This latitudinal range is chosen to include 
the polar jet stream (and the busy transatlantic flight corridor) while 
excluding the subtropical jet stream. We focus on the shear at a pres- 
sure altitude of 250 hPa (millibars), corresponding to the climatological 
core of the polar jet stream, and equating to a typical aircraft cruising 
altitude of around 34,000 feet. 
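
The correspondence between 250 hPa and roughly 34,000 feet can be checked with the standard-atmosphere pressure-altitude relation. The sketch below assumes the International Standard Atmosphere tropospheric formula (sea-level temperature 288.15 K, lapse rate 6.5 K km⁻¹, sea-level pressure 1,013.25 hPa); the function name is hypothetical, and the formula is valid only below the standard tropopause.

```python
# Sketch: ISA pressure altitude for the tropospheric layer (an assumption;
# the text quotes only the resulting altitude).
T0_K = 288.15          # ISA sea-level temperature (K)
LAPSE_K_PER_M = 0.0065 # ISA tropospheric lapse rate (K m^-1)
P0_HPA = 1013.25       # ISA sea-level pressure (hPa)
G0 = 9.80665           # standard gravity (m s^-2)
R_DRY = 287.05287      # specific gas constant for dry air (J kg^-1 K^-1)
FEET_PER_METRE = 1.0 / 0.3048


def pressure_altitude_ft(p_hpa):
    """Pressure altitude in feet for pressures between ~226 and 1013 hPa."""
    exponent = R_DRY * LAPSE_K_PER_M / G0
    height_m = (T0_K / LAPSE_K_PER_M) * (1.0 - (p_hpa / P0_HPA) ** exponent)
    return height_m * FEET_PER_METRE


if __name__ == "__main__":
    print(f"{pressure_altitude_ft(250.0):.0f} ft")
```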

We begin by analysing annual-mean upper-level temperature trends. 
As shown in Fig. 1, all three reanalysis datasets indicate a strengthening 
of the mid-latitude meridional temperature gradient at 250 hPa. The 
250 hPa pressure surface evidently intersects the tropopause at around 
50°-60° N, with lower-stratospheric cooling on the poleward side and 
upper-tropospheric warming on the equatorward side. The upper- 
tropospheric warming trend is slightly stronger in ERA-Interim and 
JRA-55, and the lower-stratospheric cooling trend is slightly stronger 
in NCEP/NCAR. Despite these minor differences, the spatial patterns 
and magnitudes of the temperature trends are broadly consistent across 
the datasets. Unlike the warming trends, the cooling trends are gener- 
ally not statistically significant (except near Iceland in NCEP/NCAR), 
probably because of large inter-annual variability associated with the 
northern hemispheric circumpolar vortex²⁸. 
colours represent positive and negative trends, respectively. Error bars 
represent the 95% confidence intervals in the slope of the ordinary 
least-squares regression (two-tailed t-test; n = 39). 


To assess the vertical structure of the trends in the meridional 
temperature gradient, we calculate a bulk north-south temperature 
difference across the North Atlantic using a two-box method. On each 
pressure surface, annual-mean temperatures are averaged within a 
subpolar box (50°-70° N, 10°-80° W) and then subtracted from those 
averaged within a subtropical box (30°-50° N, 10°-80° W). This calcu- 
lation yields a zonal-mean bulk meridional temperature difference, and 
the trends in this quantity are shown in Fig. 2. There is good agreement 
between the reanalysis datasets, with all three showing a statistically 
significant weakening of the meridional temperature gradient in the 
lower atmosphere and a statistically significant strengthening in the 
upper atmosphere. There is a transition between these two influences 
Fig. 3 | Time series of annual-mean wind characteristics in the North 
Atlantic at 250 hPa over the period 1979-2017. a, Vertical shear in 
the zonal wind. b, Zonal wind speed. Data are presented from the ERA- 
Interim, NCEP/NCAR and JRA-55 reanalysis datasets. Also shown are the 
mean of the three reanalysis datasets and the linear trend in the mean. 
Fig. 4 | Annual-mean trends in vertical shear in zonal wind in the North 
Atlantic at 250 hPa over the period 1979-2017. a–c, Actual vertical wind 
shear trends calculated from the wind field. d–f, Expected vertical 
wind shear trends calculated from the temperature field using thermal 
wind balance. Linear trends are calculated using ordinary least-squares 
regression from the ERA-Interim (a, d), NCEP/NCAR (b, e) and 


at around 450 hPa. There are some minor discrepancies, with NCEP/ 
NCAR showing both a faster weakening of the meridional tempera- 
ture gradient in the lower atmosphere and a faster strengthening aloft. 
At 250 hPa, however, all three reanalysis datasets show a statistically 
significant strengthening of the temperature difference by nearly 
0.2 K per decade, consistent with Fig. 1. 

To assess the impacts of the increasing meridional temperature 
gradient at 250 hPa on the atmospheric circulation, time series of the 
annual-mean vertical shear in zonal wind, averaged over the region 
30°-70° N and 10°-80° W, are shown in Fig. 3a. All three reanalysis 
datasets are clearly in good agreement with respect to the inter-annual 
variability and the superimposed upward trend. The multi-reanalysis 
ensemble-mean vertical wind shear shows a statistically significant 
(P = 0.03) increase of 15% (0.07 m s⁻¹ (100 hPa)⁻¹ per decade) over 
the 39-year period. The individual increases range from 11% in JRA-55 
(0.06 m s⁻¹ (100 hPa)⁻¹ per decade, P = 0.09) to 17% in ERA-Interim 
(0.08 m s⁻¹ (100 hPa)⁻¹ per decade, P = 0.02) and 17% in NCEP/ 
NCAR (0.08 m s⁻¹ (100 hPa)⁻¹ per decade, P = 0.01). In contrast, as 
shown in Fig. 3b, the annual-mean zonal wind speed averaged over 
the same region at 250 hPa has not significantly changed in any of the 
three datasets (P = 0.72 for the slope of the ensemble-mean trend). It 
is notable that there is less spread between the three datasets for the 
shear than the speed; this may be because the speed is biased low in 
NCEP/NCAR because of the relatively coarse resolution compared to 
ERA-Interim and JRA-55, whereas this bias evidently disappears when 
vertical differences are taken to compute the shear. 

The increased shear without increased speed shown for the upper 
atmosphere in Fig. 3 indicates that the weaker meridional temperature 


JRA-55 (c, f) reanalysis datasets. Significant trends are indicated 
by stippling (two-tailed t-test; P < 0.05; n = 39). To indicate the 
climatological jet stream position, the 1979-2017 annual-mean zonal 
wind at 250 hPa in each reanalysis dataset is also shown (black contours 
every 5 m s⁻¹). 


gradient (and weaker vertical wind shear) in the lower troposphere is 
masking the stronger meridional temperature gradient (and stronger 
vertical wind shear) in the upper troposphere and lower stratosphere, 
through a large degree of cancellation in the vertically integrated 
thermal wind. We illustrate this effect by showing vertical profiles of 
trends in shear and speed throughout the depth of the troposphere in 
Extended Data Fig. 1. The shear is strengthening within the jet core 
as well as throughout the broader region influenced by the jet stream 
(Extended Data Fig. 2) and the trends are not attributable to a shift in 
the annual-mean latitude of the jet core (Extended Data Fig. 3). 

To relate trends in the meridional temperature gradient to trends in 
the vertical shear, we invoke the time derivative of the thermal wind 
balance equation (1): 

$$ -\frac{\partial}{\partial t}\left(\frac{\partial u}{\partial p}\right) = -\frac{R}{fp}\,\frac{\partial}{\partial t}\left(\frac{\partial T}{\partial y}\right) \qquad (2) $$

We calculate both sides of this equation independently at each grid- 
point, as a measure of the extent to which the vertical wind shear 
changes are attributable to the local thermal wind response to the 
meridional temperature gradient changes. The time derivatives are 
evaluated as the linear trends over the period 1979-2017, calculated by 
applying ordinary least-squares regression to annual-mean values of 
∂u/∂p and ∂T/∂y at each grid-point on the 250 hPa pressure surface. 
Maps of the left side of equation (2), the directly calculated vertical 
wind shear trend produced by differencing the wind fields at the two 
adjacent pressure levels, are shown in Fig. 4a–c. Maps of the right side 
of equation (2), the expected vertical wind shear trend produced by 
using the temperature field and assuming thermal wind balance, are 
shown in Fig. 4d–f. There is a clear trend towards stronger vertical shear 
at 250 hPa over almost the entire North Atlantic domain in all three 
reanalysis datasets. The trend is statistically significant in the core of 
the climatological jet stream and on the poleward flank. We note the 
similarity in spatial patterns between these observed vertical wind shear 
increases and future projections of increased clear-air turbulence. 
The good agreement between the left and right sides of equation (2), 
in terms of both the spatial patterns (the pattern correlation coefficients 
are r > 0.70 in all three datasets) and magnitudes, confirms that the 
vertical wind shear trends are indeed largely attributable to the response 
of the thermal wind to the meridional temperature gradient trends. The 
small discrepancies are presumably attributable to the numerical finite 
differences used to estimate the derivatives, as well as to weak ageo- 
strophic and non-hydrostatic effects. 
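
A pattern correlation coefficient of the kind quoted above can be sketched as a centred Pearson correlation taken over all grid points of two trend maps. Whether the published values use area weighting is not stated here, so the unweighted form below, and the function name, are assumptions.

```python
import math


def pattern_correlation(field_a, field_b):
    """Centred, unweighted Pearson correlation between two 2-D fields.

    Each field is a list of rows (latitude) of equal-length lists (longitude).
    """
    a = [value for row in field_a for value in row]
    b = [value for row in field_b for value in row]
    if len(a) != len(b) or not a:
        raise ValueError("fields must be non-empty and the same shape")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    actual = [[0.02, 0.05], [0.08, 0.10]]      # illustrative trend maps
    expected = [[0.01, 0.06], [0.07, 0.12]]
    print(f"r = {pattern_correlation(actual, expected):.2f}")
```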

In summary, we have identified the first observationally based 
evidence of increased vertical wind shear in the North Atlantic upper- 
level jet stream over the satellite era (1979-2017). The increase of 15% 
(with a range of 11%-17%) is statistically significant, is present in three 
independently produced reanalysis datasets, and is attributable to the 
thermal wind response to the strengthening upper-level meridional 
temperature gradient. The stronger shear is consistent with the inten- 
sification of clear-air turbulence expected from climate change¹⁸⁻²⁰, 
because clear-air turbulence is generated by strong vertical wind shear 
(which means small Richardson number; we note that a 15% shear 
increase implies roughly a 30% Richardson number decrease, because 
of their inverse square relationship). In contrast to the large increase in 
vertical wind shear, we find that the zonal wind speed has not changed, 
consistent with previous studies¹¹,¹². The explanation for this effect is 
that, in the vertically integrated thermal wind balance equation, the 
weaker meridional temperature gradient and weaker vertical wind 
shear in the lower troposphere are mostly offsetting the stronger 
meridional temperature gradient and stronger vertical wind shear 
aloft. Increased vertical wind shear has important implications, not 
only for clear-air turbulence and its impacts on aviation, but also for the 
turbulent mixing of atmospheric constituents across the tropopause²⁹, 
with potentially important consequences for large-scale atmospheric 
thermodynamics and dynamics³⁰. 
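
The Richardson number scaling mentioned above can be checked numerically. As a minimal sketch, assuming the static stability is unchanged so that the Richardson number varies as the inverse square of the shear, the 30% figure corresponds to the first-order estimate (twice the fractional shear increase), while the exact inverse-square ratio for a 15% shear increase gives a somewhat smaller decrease of about 24%. The function name is hypothetical.

```python
def richardson_fractional_change(shear_fractional_increase):
    """Fractional change in Ri for a given fractional shear increase.

    Assumes Ri is proportional to 1 / shear^2 with stratification held fixed.
    """
    return 1.0 / (1.0 + shear_fractional_increase) ** 2 - 1.0


if __name__ == "__main__":
    exact = richardson_fractional_change(0.15)
    first_order = -2.0 * 0.15
    print(f"exact: {exact:.1%}, first-order: {first_order:.0%}")
```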

We conclude that the effects of climate change and variability on 
the upper-level jet stream are being partially obscured by the tradi- 
tional focus on wind speed rather than wind shear. We suggest that 
climate-modelling studies into the response of the jet streams to cli- 
mate change should therefore include consideration of the vertical 
shear as well as the speed. We anticipate that inter-model differences 
in upper-level vertical wind shear trends will have a clear interpretation 
in terms of different upper-level temperature trends. On the other hand, 
inter-model differences in upper-level wind speed trends may be more 
difficult to interpret, because of different balances in the competition 
between temperature trends at upper and lower levels. 
METHODS 


The North Atlantic region was chosen for this study partly because it is the world’s 
busiest oceanic flight corridor. Owing to the zonally extended nature of the polar 
jet stream in this region, transatlantic flights are typically affected by the strength 
and position of the jet stream throughout their entire flight paths. The effects of 
the jet stream on aircraft include headwinds, tailwinds and clear-air turbulence. A 
further reason for choosing the North Atlantic is that—unlike the North Pacific—it 
exhibits separate polar and subtropical jet streams, allowing an analysis of the polar 
jet stream exclusively. 

We used pressure-level zonal wind and temperature data from the ERA-Interim, 
NCEP/NCAR and JRA-55 reanalysis datasets at six-hourly analysis intervals from 
1 January 1979 to 31 December 2017 inclusive, giving 39 full years of data. All 
datasets were used on a standard latitude-longitude grid (0.75° for ERA-Interim, 
2.5° for NCEP/NCAR and 1.25° for JRA-55). Trends were calculated using ordi- 
nary least-squares regression, and statistical significance was assessed at the 95% 
confidence level (P < 0.05) according to a two-tailed t-test. The effect of temporal 
autocorrelation on statistical significance was tested in the computed annual-mean 
data and found to be negligible. Percentage changes were calculated using the 
values of the fitted linear trend lines in 1979 and 2017. 
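
The trend and percentage-change calculation described above can be sketched as follows: an ordinary least-squares line is fitted to annual means, and the percentage change is taken between the fitted values in the first and last years. The synthetic series and the function names are illustrative assumptions, and the significance test is omitted.

```python
def ols_trend(years, values):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def percent_change(years, values):
    """Percentage change between fitted values in the first and last years."""
    slope, intercept = ols_trend(years, values)
    start = intercept + slope * years[0]
    end = intercept + slope * years[-1]
    return 100.0 * (end - start) / start


if __name__ == "__main__":
    years = list(range(1979, 2018))                      # 39 annual means
    shear = [1.8 + 0.007 * (y - 1979) for y in years]    # synthetic series
    slope, _ = ols_trend(years, shear)
    print(f"{slope * 10:.3f} per decade, {percent_change(years, shear):.1f}%")
```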

To calculate the two-box zonal-mean bulk meridional temperature difference, 
we first averaged the annual-mean temperature in a subtropical box (30°-50° N, 
10°-80° W) and a subpolar box (50°-70° N, 10°-80° W), with a cosine(latitude) 
weighting factor to account for the convergence of grid points at high latitudes. 
The latitudinal bounds of these boxes were chosen to be approximately either side 
of the climatological annual-mean jet stream latitude in the North Atlantic. We 
then found the meridional temperature difference across the North Atlantic by 
subtracting the subtropical box temperature from the subpolar box temperature. 
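
A minimal sketch of the two-box calculation is given below. It assumes temperatures are stored as one row per latitude, already restricted to 10°-80° W, and it adopts the subtropical-minus-subpolar sign convention used in the main text. The treatment of the shared 50° N boundary (assigned to the subpolar box) and the function names are assumptions.

```python
import math


def box_mean(temps, lats_deg, lat_min, lat_max, include_upper=False):
    """cos(latitude)-weighted mean temperature within a latitude band.

    temps[i] is the list of temperatures (K) along latitude lats_deg[i].
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for lat, row in zip(lats_deg, temps):
        in_band = lat_min <= lat < lat_max or (include_upper and lat == lat_max)
        if in_band:
            weight = math.cos(math.radians(lat))
            weighted_sum += weight * sum(row)
            weight_total += weight * len(row)
    return weighted_sum / weight_total


def two_box_difference(temps, lats_deg):
    """Subtropical (30-50 N) minus subpolar (50-70 N) box-mean temperature."""
    subtropical = box_mean(temps, lats_deg, 30.0, 50.0)
    subpolar = box_mean(temps, lats_deg, 50.0, 70.0, include_upper=True)
    return subtropical - subpolar
```

With this convention, a positive trend in the difference indicates a strengthening meridional temperature gradient.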

The jet stream was analysed in the North Atlantic region (10°-80° W, 
30°-70° N). The annual-mean regional-mean 250 hPa vertical shear in zonal wind 
was calculated by taking a centred vertical finite difference using the annual-mean 
zonal winds at 300 and 200 hPa: 

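$$ -\left(\frac{\partial u}{\partial p}\right)_{250\,\mathrm{hPa}} \approx \frac{u(200\,\mathrm{hPa}) - u(300\,\mathrm{hPa})}{100\,\mathrm{hPa}} \qquad (3) $$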

We also calculated trends in the annual-mean regional-mean (area-weighted) zonal 
wind speed at 250 hPa over the North Atlantic region. Vertical profiles of vertical 
shear trends were calculated by taking centred finite differences at intervals of 
50 hPa for ERA-Interim and JRA-55, and from neighbouring pressure levels in 
NCEP/NCAR (owing to the spacing of available pressure-level data). 

The annual-mean regional-maximum vertical shear was calculated by a similar 
centred-difference method: we first subtracted the zonal wind at 300 hPa from the 
zonal wind at 200 hPa, and we then found the maximum value within the North 
Atlantic region at each six-hourly interval, before averaging the maximum values 
annually. For the annual-mean regional-maximum zonal wind speed, we found the 
maximum zonal wind speed at 250 hPa within the North Atlantic region at each 
six-hourly interval, before averaging annually. In both cases, the latitude at which 
the maximum occurred was stored. 
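
The regional-maximum procedure can be sketched as follows. This is an illustrative example rather than the code used here: the tiny three-latitude grid, the wind values and the function names are invented, and only two time steps stand in for a full year of six-hourly fields.

```python
def regional_max(field, lats):
    """Return (maximum value, latitude of the maximum) for field[lat][lon]."""
    best_value, best_lat = None, None
    for row, lat in zip(field, lats):
        row_max = max(row)
        if best_value is None or row_max > best_value:
            best_value, best_lat = row_max, lat
    return best_value, best_lat


def annual_mean_regional_max(u200_steps, u300_steps, lats):
    """Average the per-step regional maxima of u(200 hPa) - u(300 hPa).

    Also returns the mean latitude at which those maxima occurred.
    """
    maxima, latitudes = [], []
    for u200, u300 in zip(u200_steps, u300_steps):
        diff = [[a - b for a, b in zip(r200, r300)]
                for r200, r300 in zip(u200, u300)]
        value, lat = regional_max(diff, lats)
        maxima.append(value)
        latitudes.append(lat)
    n = len(maxima)
    return sum(maxima) / n, sum(latitudes) / n


# Two hypothetical time steps on a 3 x 2 grid (winds in m/s).
lats = [40.0, 50.0, 60.0]
u300_steps = [[[20, 22], [30, 28], [15, 14]], [[20, 20], [25, 25], [20, 20]]]
u200_steps = [[[25, 26], [40, 35], [18, 16]], [[24, 23], [30, 29], [28, 26]]]

mean_max_diff, mean_lat = annual_mean_regional_max(u200_steps, u300_steps, lats)
```

Taking the maximum at each time step before averaging, rather than the maximum of the annual-mean field, keeps the measure tied to the instantaneous jet core even as the jet meanders.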

When the calculations in Fig. 3 are repeated using the annual-mean regional- 
maximum vertical shear, instead of the annual-mean regional-mean vertical shear, 
a statistically significant ensemble-mean increase of 11% (P < 0.01) in the shear 
is found. The individual increases are 10% in ERA-Interim (P < 0.01), 18% in 
NCEP/NCAR (P < 0.01), and 7% in JRA-55 (P < 0.01) (Extended Data Fig. 2). 
These results confirm that the shear is strengthening within the jet core as well as 
throughout the broader region influenced by the jet stream. The trends are not 
attributable to a shift in the annual-mean latitude of the jet core, which shows no 
statistically significant trend over the period (Extended Data Fig. 3). 

We used the time derivative of the thermal wind balance equation to relate 
linear trends in the meridional temperature gradient to linear trends in the vertical
wind shear. At 250 hPa, we calculated trends in the annual-mean values of ∂u/∂p
(using the centred finite difference method outlined above) and ∂T/∂y. The agreement
between the two was assessed through Pearson's correlation coefficient using
an area-weighted pattern correlation.
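
One way to compute an area-weighted pattern correlation on a regular latitude-longitude grid is sketched below. Using cos(latitude) as the area weight is an assumption of this example, as are the function name and the small test fields.

```python
import math


def weighted_pattern_correlation(field_a, field_b, lats):
    """Pearson correlation of two field[lat][lon] maps with cos(lat) weights."""
    weights, a_vals, b_vals = [], [], []
    for row_a, row_b, lat in zip(field_a, field_b, lats):
        w = math.cos(math.radians(lat))
        for x, y in zip(row_a, row_b):
            weights.append(w)
            a_vals.append(x)
            b_vals.append(y)
    total = sum(weights)
    mean_a = sum(w * x for w, x in zip(weights, a_vals)) / total
    mean_b = sum(w * y for w, y in zip(weights, b_vals)) / total
    cov = sum(w * (x - mean_a) * (y - mean_b)
              for w, x, y in zip(weights, a_vals, b_vals))
    var_a = sum(w * (x - mean_a) ** 2 for w, x in zip(weights, a_vals))
    var_b = sum(w * (y - mean_b) ** 2 for w, y in zip(weights, b_vals))
    return cov / math.sqrt(var_a * var_b)


# Hypothetical trend maps on a 3 x 2 grid (arbitrary units).
lats = [40.0, 50.0, 60.0]
shear_trend = [[-1.0, -2.0], [-3.0, -1.5], [-0.5, -2.5]]
gradient_trend = [[-0.9, -2.2], [-2.7, -1.4], [-0.8, -2.4]]
r = weighted_pattern_correlation(shear_trend, gradient_trend, lats)
```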

According to thermal wind balance, the trend in the zonal wind speed in the 
upper troposphere and lower stratosphere is given by the vertical integral of 
equation (2). This vertical integral is performed throughout the depth of the free 
troposphere, starting from the top of the planetary boundary layer. Temperature 
gradients in the lower troposphere are included in the integral, and therefore Arctic 
amplification at low levels is able to influence the wind speed at upper levels. For 
example, written in equation form, we have: 


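\[
\frac{\partial u(250\,\mathrm{hPa})}{\partial t} = \int_{p_0}^{450\,\mathrm{hPa}} \frac{R}{fp}\,\frac{\partial}{\partial t}\left[\frac{\partial T}{\partial y}\right]\mathrm{d}p + \int_{450\,\mathrm{hPa}}^{250\,\mathrm{hPa}} \frac{R}{fp}\,\frac{\partial}{\partial t}\left[\frac{\partial T}{\partial y}\right]\mathrm{d}p \approx 0 \qquad (4)
\]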


where p₀ is the pressure at the top of the planetary boundary layer. Here, the free
troposphere has been divided into two layers at 450 hPa, by reference to Fig. 2.
The lower boundary term ∂u(p₀)/∂t arising from the vertical integration has
been neglected in equation (4), because the zonal wind speed in the lower
troposphere has no statistically significant trend in any of the reanalysis datasets,
as shown in Extended Data Fig. 1d-f. Our study shows that, on the right-hand
side of equation (4), the first integral (which includes the weakening low-level 
temperature gradient from Arctic amplification) and the second integral (which 
includes the strengthening upper-level temperature gradient) are essentially 
equal and opposite when averaged over the North Atlantic region, thus largely 
cancelling out and leaving no statistically significant trend in the upper-level 
speed. 
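
The cancellation between the two layers can be illustrated with a toy calculation in which the trend in the meridional temperature gradient is taken to be constant within each layer, so that each integral reduces to (R/f) × trend × ln(p_top/p_bottom). All numbers below are hypothetical: the boundary-layer top of 850 hPa, the latitude of 50° N and the gradient trends are assumptions chosen so that the two contributions cancel, and are not results from the reanalyses.

```python
import math

R_DRY = 287.0     # J kg^-1 K^-1, gas constant for dry air
OMEGA = 7.292e-5  # s^-1, Earth's rotation rate


def coriolis(lat_deg):
    """Coriolis parameter f at the given latitude."""
    return 2.0 * OMEGA * math.sin(math.radians(lat_deg))


def layer_contribution(gradient_trend, p_from, p_to, lat_deg):
    """Integral of (R / (f p)) * gradient_trend dp from p_from to p_to.

    gradient_trend is d/dt(dT/dy) in K m^-1 s^-1, assumed constant in the
    layer; the pressures only enter through their ratio.
    """
    return R_DRY / coriolis(lat_deg) * gradient_trend * math.log(p_to / p_from)


P0_HPA = 850.0  # assumed top of the planetary boundary layer
LAT = 50.0      # assumed representative latitude

# dT/dy is negative (colder poleward), so a positive trend weakens it.
lower_trend = 1.0e-14
# Upper-level strengthening chosen so that the two layers cancel exactly.
upper_trend = -lower_trend * math.log(450.0 / P0_HPA) / math.log(250.0 / 450.0)

lower = layer_contribution(lower_trend, P0_HPA, 450.0, LAT)  # slows the jet
upper = layer_contribution(upper_trend, 450.0, 250.0, LAT)   # speeds it up
net = lower + upper  # trend in u at 250 hPa, in m s^-2
```

With these assumed layer bounds, the upper layer is slightly thinner in ln(p) than the lower one, so a slightly larger gradient trend aloft is enough to offset the low-level weakening.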


Data availability
ERA-Interim and JRA-55 reanalysis data may be obtained from the Research Data
Archive at the National Center for Atmospheric Research (NCAR), Computational
and Information Systems Laboratory, Boulder, Colorado, USA
(https://doi.org/10.5065/D6CR5RD9 and https://doi.org/10.5065/D6HH6H41, respectively).

Extended Data Fig. 1 | Vertical profiles of annual-mean trends in
wind characteristics in the North Atlantic over the period 1979-2017.
a-c, Trends in the vertical shear in the zonal wind. d-f, Trends in the zonal
wind speed. Linear trends are calculated from the ERA-Interim (a, d),
NCEP/NCAR (b, e) and JRA-55 (c, f) reanalysis datasets. Red and blue
colours represent positive and negative trends, respectively. Error bars
represent the 95% confidence intervals in the slope of the ordinary
least-squares regression (two-tailed t-test; n = 39).

Extended Data Fig. 2 | Annual-mean regional-maximum six-hourly
vertical shear in zonal wind in the North Atlantic at 250 hPa over the
period 1979-2017. Data are presented from the ERA-Interim, NCEP/NCAR
and JRA-55 reanalysis datasets. Also shown are the mean of the
three reanalysis datasets and the linear trend in the mean.

Extended Data Fig. 3 | Annual-mean latitude of the core of the polar
jet stream in the North Atlantic at 250 hPa over the period 1979-2017.
a, Annual-mean latitude of the regional-maximum six-hourly vertical
shear in zonal wind. b, Annual-mean latitude of the regional-maximum
six-hourly zonal wind speed. Data are presented from the ERA-Interim,
NCEP/NCAR and JRA-55 reanalysis datasets. Also shown are the mean
of the three reanalysis datasets and the linear trend in the mean, which
has a statistically insignificant slope of -0.1° per decade (two-tailed t-test;
P = 0.54; n = 39) (a) and 0.01° per decade (two-tailed t-test; P = 0.76;
n = 39) (b).


Seismic velocities of CaSiO3 perovskite can explain
LLSVPs in Earth's lower mantle


A. R. Thomson¹*, W. A. Crichton², J. P. Brodholt¹,³, I. G. Wood¹, N. C. Siersch⁴, J. M. R. Muir⁵, D. P. Dobson¹ & S. A. Hunt¹


Seismology records the presence of various heterogeneities
throughout the lower mantle, but the origins of these signals,
whether thermal or chemical, remain uncertain, and therefore
much of the information that they hold about the nature of the 
deep Earth is obscured. Accurate interpretation of observed seismic 
velocities requires knowledge of the seismic properties of all of 
Earth’s possible mineral components. Calcium silicate (CaSiO3) 
perovskite is believed to be the third most abundant mineral 
throughout the lower mantle. Here we simultaneously measure 
the crystal structure and the shear-wave and compressional-wave 
velocities of samples of CaSiO3 perovskite, and provide direct 
constraints on the adiabatic bulk and shear moduli of this material. 
We observe that incorporation of titanium into CaSiO3 perovskite
stabilizes the tetragonal structure at higher temperatures, and
that the material's shear modulus is substantially lower than is
predicted by computations or thermodynamic datasets. When
combined with literature data and extrapolated, our results suggest 
that subducted oceanic crust will be visible as low-seismic-velocity 
anomalies throughout the lower mantle. In particular, we show 
that large low-shear-velocity provinces (LLSVPs) are consistent 
with moderate enrichment of recycled oceanic crust, and mid- 
mantle discontinuities can be explained by a tetragonal-cubic phase 
transition in Ti-bearing CaSiO3 perovskite. 

The lower mantle is vast, extending from the seismic discontinuity
observed at approximately 660 km depth to the core-mantle boundary
(CMB) at a depth of about 2,890 km. Tomographic images demonstrate
that despite a smooth variation of compressional-wave velocity,
shear-wave velocity and density (respectively vp, vs and ρ) in 1D velocity
models, the lower mantle is heterogeneous and regularly refertilized
by subducting slabs. Sluggish diffusive re-equilibration and incomplete
mechanical mixing mean that large-scale patterns of mantle convection
may be directly observed via tomographic velocity anomalies
and/or the distribution of seismic scatterers. Identifying the causes
of heterogeneities requires accurate mineralogical models of Earth's
mantle to facilitate comparisons between geophysical observations and
predicted seismic velocities. However, a major uncertainty in many
models has been the influence of CaSiO3 perovskite (Ca-Pv, here
corresponding to Ca[SixTi(1−x)]O3) on velocity, despite the widespread
expectation that it is the lower mantle's third most abundant phase,
comprising 5-10 vol.% and 24-29 vol.% of peridotitic and basaltic
assemblages, respectively.

Uncertainties stem from a sparsity of reliable measurements of
Ca-Pv's physical properties, which are technically challenging because
CaSiO3 is unrecoverable, undergoing spontaneous amorphization at
room temperature during decompression. The widely used thermodynamic
model of Stixrude et al. predicts that the seismic velocity of
Ca-Pv is substantially higher than in average one-dimensional profiles
such as the Preliminary Reference Earth Model (PREM), and
therefore low-velocity anomalies are difficult to explain using recycled
crustal material. Although this is the widely adopted view, there is
currently no consensus on the seismic properties of Ca-Pv. Existing
high-temperature calculations suggest that the velocity of Ca-Pv
might be either slightly lower or much higher than in PREM (Fig. 1).
By contrast, room-temperature experiments have measured a shear
velocity of Ca-Pv that is at least 7% lower than the lowest computational
estimates; however, there is difficulty extrapolating these to
high-temperature conditions due to intervening phase transformations of the
Ca-Pv structure. Very recently, high-temperature experimental
velocity measurements that are also lower than all computational values
have been reported, although this study did not consider extrapolations
of Ca-Pv's velocity throughout the deep mantle, leaving Ca-Pv's
contribution in generating lower-mantle signatures unresolved. Here
we report synchrotron-based high-pressure (P), high-temperature
(T) experiments that simultaneously measure the crystal structure
and seismic velocities (vp and vs) of Ca[SixTi(1−x)]O3 compositions
(x = 0.6 and 1) that bracket the range of inclusions found in natural
superdeep diamonds and are expected in lower-mantle assemblages.
Combining our new data with ab initio calculations and literature data,
we directly address the influence of crystallographic phase transitions
on the velocity of Ca-Pv and apply our results to provide a new understanding
of Ca-Pv's geophysical signature throughout the lower mantle
(see Methods).

Fig. 1 | Compressional- and shear-wave velocities of cubic CaSiO3
perovskite from this and previous studies. a, Compressional-wave
velocity and b, shear-wave velocity of CaSiO3 perovskite predicted in
this and previous studies throughout the mantle. Individual
experimental measurements are shown with symbols, coloured by
temperature (key at top; white symbols are data collected at 300 K).
All velocity curves are extracted along a 1,500 K mantle adiabat. Thick
coloured curve (and 95% confidence interval in grey) represents
the velocity of cubic Ca-Pv based on finite-strain modelling (this study).
Bold dashed line is the PREM velocity profile.


¹Department of Earth Sciences, University College London, London, UK. ²ESRF – The European Synchrotron, Grenoble, France. ³Centre for Earth Evolution and Dynamics, University of Oslo, Oslo,
Norway. ⁴Bayerisches Geoinstitut, University of Bayreuth, Bayreuth, Germany. ⁵School of Earth and Environment, University of Leeds, Leeds, UK. *e-mail: a.r.thomson@ucl.ac.uk


29 AUGUST 2019 | VOL 572 | NATURE | 643


LETTER

In line with expectations from previous experiments, in situ
diffraction confirms that Ca-Pv is cubic at high temperature, and
undergoes one or more structural distortions upon cooling to room
temperature (Fig. 2a). Refinement and indexing of diffraction patterns
reveals that endmember CaSiO3 transforms on cooling from cubic
(Pm-3m) at high temperatures into tetragonal (I4/mcm) perovskite
between 380 K and 420 K at about 12 GPa (Fig. 2c). This phase transition
is identified by the nonlinear splitting upon cooling (observed as
broadening) of all diffraction peaks, except for those with indices hhh
(that is, 111 and 222), from the cubic aristotype unit cell (a = 3.5 Å,
Fig. 2b and Extended Data Fig. 1e). Additionally, weak superlattice
reflections at d-spacings (in Å) of approximately 2.11, 1.61, 1.07 and
0.98 (Extended Data Fig. 2), which uniquely identify the I4/mcm structure,
were observed below about 420 K. Titanium incorporation (similarly
to aluminium) increases the upper stability limit of tetragonal
Ca-Pv considerably, here by nearly 800 K. We find that Ca[Si0.6Ti0.4]O3
takes the Fm-3m space group at high temperature (Extended Data
Figs. 1, 3, 4), possessing a double perovskite unit cell, with partial Si:Ti
cation ordering that is apparently maintained throughout cooling. The
cubic-tetragonal (Fm-3m to I4/m) transition in Ca[Si0.6Ti0.4]O3 is
observed at approximately 1,200 K (Extended Data Fig. 1). Upon further
cooling, a subsequent symmetry distortion, thought to be to P2₁/c,
is observed at about 700 K. These observations provide very strong
evidence that Ca-Pv follows the same structural transitions on cooling
as CaTiO3 (see equations below), with the apparent reductions in
symmetry from I4/mcm and Pbnm to their I4/m and P2₁/c subgroups
being a consequence of cation ordering:


Pm-3m → I4/mcm (CaSiO3, ~12 GPa)


Fm-3m → I4/m → P2₁/c (Ca[Si0.6Ti0.4]O3, ~12 GPa)


Pm-3m → I4/mcm → Pbnm (CaTiO3, 0 GPa)
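
As a plausibility check on the superlattice d-spacings quoted above, the sketch below evaluates d = a/√(h² + k² + l²) for half-integer indices on the cubic aristotype cell (a = 3.5 Å) and converts them to diffraction angles with Bragg's law, using one of the X-ray wavelengths given in Methods. The particular half-integer index triplets are an assumption made for this example; they are not an indexing reported in the text.

```python
import math


def d_spacing(a, h, k, l):
    """Interplanar spacing for a cubic cell of edge a (same units as a)."""
    return a / math.sqrt(h * h + k * k + l * l)


def two_theta_deg(d, wavelength):
    """Diffraction angle 2-theta (degrees) from Bragg's law, first order."""
    return 2.0 * math.degrees(math.asin(wavelength / (2.0 * d)))


A_CUBIC = 3.5         # Å, aristotype cell edge quoted in the text
WAVELENGTH = 0.22542  # Å, one of the wavelengths listed in Methods

# Assumed half-integer indices referred to the aristotype cell.
half_integer_hkl = [(1.5, 0.5, 0.5), (1.5, 1.5, 0.5),
                    (2.5, 1.5, 1.5), (3.5, 0.5, 0.5)]

spacings = [d_spacing(A_CUBIC, *hkl) for hkl in half_integer_hkl]
angles = [two_theta_deg(d, WAVELENGTH) for d in spacings]
```

Under these assumptions the four spacings round to the values listed in the text, which is consistent with a doubled cell arising from octahedral tilting, although it does not by itself establish the space group.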


Acoustic velocities, determined simultaneously with synchrotron X-ray
diffraction using pulse-echo ultrasonic interferometry, demonstrate
that the observed phase transitions of Ca-Pv are associated with large
elastic anomalies. CaSiO3 and Ca[Si0.6Ti0.4]O3 samples undergo vp and
vs reductions of 4-14% and 8-20%, respectively, owing to their
cubic-tetragonal transitions (Fig. 3). Continued cooling of Ca[Si0.6Ti0.4]O3
into its presumed monoclinic structure sees the velocities increase near
ambient temperature. The acoustic 'shear-strengthening' with temperature
that we observe in tetragonal Ca-Pv is also reported for polycrystalline
BaTiO3 samples. Such behaviour is thought to result from a
temperature-activated twin-domain-wall process that also causes high

Fig. 2 | X-ray diffraction patterns demonstrating the cubic-tetragonal
phase transition in CaSiO3 perovskite. a, Rietveld refined X-ray
diffraction patterns collected at about 12 GPa and 1,273 K (red, cubic) or
300 K (blue, tetragonal), with cubic CaSiO3 peaks labelled by [hkl] and
tick marks for other cell components. b, Normalized full-width at
half-maximum (FWHM) of selected diffraction peaks (see key) of CaSiO3
perovskite as a function of temperature. c, Refined lattice parameters of
CaSiO3 perovskite sample as a function of experimental temperature with
2σ uncertainties.


acoustic attenuation. Although we cannot rigorously measure acoustic
attenuation, we do observe a diminution in the intensity of reflected
acoustic waves when samples are tetragonal. Experiments on endmember
CaSiO3 demonstrate a modest reduction in vp and vs across the
cubic-tetragonal transition at T < 450 K; however, the relatively small
decrease observed is likely to continue at sub-ambient temperatures that
could not be examined in this study. We suggest that CaSiO3, if cooled
further, probably undergoes similar magnitudes of velocity reduction to
those observed for Ca[Si0.6Ti0.4]O3. Absolute acoustic velocities measured
for CaSiO3 are lower than computational predictions, but vp and
vs in this study agree extremely well with previous experimental measurements
made at room temperature. It is only with increasing
temperature that our results diverge from previous experimental data.
The excellent room-temperature agreement leads us to conclude that
previous calculations must have overestimated the velocities, specifically
the shear modulus (G), of Ca-Pv (as discussed below). We also
observe that the temperature dependences of velocities (dv/dT) in cubic
Ca-Pv are 1.5-3 times larger than those experimentally observed for
other mantle silicates. However, the temperature dependence of the
elastic moduli in our experiments (dKs/dT and dG/dT are both about
−0.027 to −0.03 GPa K⁻¹) match those observed for cubic [Ca,Sr]TiO3

Fig. 3 | Acoustic velocities of Ca-Pv samples at high-PT conditions.
a-d, Compressional-wave (vp; a) and shear-wave (vs; b) velocities, with
derived shear modulus (G; c) and bulk modulus (Ks; d) of Ca-Pv samples
measured as a function of temperature at constant press load (about
12 GPa), for Ca[Si0.6Ti0.4]O3 (blue circles) and CaSiO3 samples (orange and
green circles). Data from "runa" and "runb" are results from two separate
experiments on CaSiO3 at slightly different pressure. Uncertainties are
all 2σ. Small blue circles are data with a low signal-to-noise ratio, due to


perovskites (−0.024 to −0.03 GPa K⁻¹) and the bulk (Ks), but not the
shear, modulus in previous experiments on CaSiO3 (dKs/dT ≈ −0.036
and dG/dT ≈ −0.015 GPa K⁻¹).

Using ab initio molecular dynamics, we have calculated the PT slope
of the I4/mcm → Pm-3m phase transition in CaSiO3 perovskite
(Methods, Extended Data Fig. 5) in order to apply our results to Earth's
deep mantle. The calculated slope, approximately 15 K GPa⁻¹, is similar
to results from previous calculations (about 10 K GPa⁻¹) and experiments
on the I4/mcm → Pm-3m transition in SrTiO3 (approximately
18.5 K GPa⁻¹). However, it is much larger than, but still within
uncertainty of, experimental estimates (<2 K GPa⁻¹) for
CaSiO3. Assuming our PT slope is only shifted in temperature by
Ti-incorporation, average mid-ocean-ridge basalt (MORB) Ca-Pv
(~Ca[Si0.9Ti0.1]O3, ignoring other chemical components) should
undergo a cubic → tetragonal transformation at mid-mantle depths.
In addition, Ca-Pv subducted within slab assemblages, particularly for
Ti-rich Ca-Pv compositions, may in fact retain the tetragonal structure
throughout the entire mantle at average temperatures (Extended Data
Fig. 5). Pure CaSiO3, which is similar to the composition stable in peridotitic
and harzburgitic assemblages, is unlikely to become tetragonal
in the ambient mantle, but could undergo a cubic-tetragonal transition
in cold slab assemblages reaching pressures greater than about 90 GPa.
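
To make the extrapolation concrete, the phase boundary can be treated as a straight line in PT space, anchored at the transition observed near 12 GPa and shifted in temperature by Ti incorporation. The sketch below is illustrative only: the 400 K anchor is simply the midpoint of the observed 380-420 K interval, the function names are invented, and a strictly linear boundary is an assumption.

```python
SLOPE_K_PER_GPA = 15.0  # calculated PT slope quoted in the text
P_REF_GPA = 12.0        # approximate pressure of the experiments
T_REF_K = 400.0         # assumed midpoint of the observed 380-420 K interval


def transition_temperature(pressure_gpa, ti_shift_k=0.0):
    """Cubic-tetragonal boundary temperature (K) at a given pressure.

    ti_shift_k moves the whole boundary up in temperature to mimic the
    effect of Ti incorporation, with the slope left unchanged.
    """
    return T_REF_K + ti_shift_k + SLOPE_K_PER_GPA * (pressure_gpa - P_REF_GPA)


def transition_pressure(temperature_k, ti_shift_k=0.0):
    """Pressure (GPa) at which the boundary reaches a given temperature."""
    return P_REF_GPA + (temperature_k - T_REF_K - ti_shift_k) / SLOPE_K_PER_GPA


# Pure CaSiO3 at 90 GPa, and the 40% Ti composition at 12 GPa, for which
# the text reports a stability limit raised by nearly 800 K.
t_pure_90 = transition_temperature(90.0)
t_ti40_12 = transition_temperature(12.0, ti_shift_k=800.0)
```

Material colder than the boundary temperature at a given pressure would be tetragonal in this picture, which is why only cold slabs are expected to carry tetragonal pure CaSiO3 at the highest pressures.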

To evaluate our experimental results in the context of Earth's lower
mantle, we have fitted finite-strain equations of state (EoS) for cubic
and tetragonal CaSiO3 perovskite, using the thermodynamically
self-consistent Mie-Debye-Grüneisen Birch-Murnaghan formalism
commonly adopted in mineralogical models. The narrow pressure
range of our experiments means additional constraints from literature
data are required for extrapolations where velocities remain experimentally
unconstrained. All available literature data judged to reliably
constrain Ca-Pv volume and/or acoustic velocities at high-PT conditions
(Supplementary Table 1) were collated and converted to a common
pressure scale for joint inversion with new data from this study

low amplitude of the buffer rod reflection, and have larger uncertainties.
Experimental velocity measurements from previous studies are plotted
as triangles (about 12 GPa) and squares (orange, 12 ± 1 GPa; purple,
15 ± 1 GPa). Dashed vertical lines indicate the temperatures of observed
phase transitions for CaSiO3 (orange) and Ca[Si0.6Ti0.4]O3 (blue) to/from
cubic (Cub.), tetragonal (Tet.) and/or monoclinic (Mono.) structure. The
blue temperature interval represents the extent of the first-order I4/m to
P2₁/c transition in Ca[Si0.6Ti0.4]O3.


(see Methods for full fitting procedure). We note that data from recent
high-PT experiments are not included, owing to inconsistencies
within this dataset (Methods). Recovered EoS parameters (see Methods
for nomenclature) for tetragonal Ca-Pv at 300 K are V0 = 46.10(6) Å³
per formula unit, K0 = 224(4) GPa, K0′ = 4 (fixed), G0 = 107(6) GPa
and G0′ = 1.4(1) (Extended Data Fig. 6a). The EoS for cubic Ca-Pv,
fitted using high-temperature data lying above the phase transition
calculated by ab initio methods in this study (Fig. 1, Extended Data
Fig. 6b, Supplementary Table 2) has the parameters: V0 = 45.57(2) Å³,
K0 = 248(3) GPa, K0′ = 3.6(1), G0 = 107(1) GPa, G0′ = 1.66 (fixed, but
manually varied by ±0.22), q0 = 1.1(2), γ0 = 1.67(4), θ0 = 771(90) K
and ηS0 = 3.3 (fixed to approximately 2γ0). Incorporation of our newly
collected data, compared with only fitting literature data, improves
estimations of G0, γ0, q0 and α, owing to the high temperature resolution
provided by this study, whereas the dominant constraints on K0 and K0′
come from the highest-pressure literature data. The narrow pressure
range of our velocity measurements means they mainly constrain the
shear modulus, but not its pressure derivative G0′. However, literature
values of G0′ from calculations and experiments are highly consistent
and it can be fixed with some confidence, although we have also
varied it manually to assess its effect on extrapolations. Our approach
relies on high-precision data from four previous diffraction studies on
Ca-Pv at high-PT conditions and results in a single EoS that explains
all data with no outliers at the 3σ level. This provides the best option
to date to investigate Ca-Pv's velocity at deep mantle conditions and
offers self-consistent EoS values without apparent reduction in predictive
capacity throughout the pressures and temperatures relevant
to Earth's mantle.
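
For orientation, the reference-state parameters of the cubic EoS can be converted into velocities using the standard isotropic relations vp = √((K + 4G/3)/ρ), vs = √(G/ρ) and vφ = √(K/ρ). The sketch below is a rough illustration, not part of the fitting procedure: it takes the density from V0 and the molar mass of pure CaSiO3, ignores the difference between isothermal and adiabatic bulk moduli, and applies only at the reference state rather than at mantle conditions.

```python
import math

AVOGADRO = 6.02214076e23       # mol^-1
MOLAR_MASS_CASIO3 = 116.16e-3  # kg mol^-1 (Ca + Si + 3 O)


def density(volume_a3, molar_mass=MOLAR_MASS_CASIO3):
    """Density (kg m^-3) from the volume per formula unit in cubic ångströms."""
    return molar_mass / (AVOGADRO * volume_a3 * 1e-30)


def velocities(k_gpa, g_gpa, rho):
    """Return (vp, vs, v_phi) in km/s for an isotropic aggregate."""
    k = k_gpa * 1e9
    g = g_gpa * 1e9
    vp = math.sqrt((k + 4.0 * g / 3.0) / rho)
    vs = math.sqrt(g / rho)
    v_phi = math.sqrt(k / rho)
    return vp / 1e3, vs / 1e3, v_phi / 1e3


# Reference-state values quoted in the text for cubic Ca-Pv.
rho0 = density(45.57)
vp0, vs0, vphi0 = velocities(248.0, 107.0, rho0)
```

Because vs depends only on G and ρ, a lower shear modulus reduces vs directly and vp more weakly, while leaving the bulk sound velocity unchanged.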

Our results imply that cubic Ca-Pv's compressional- and shear-velocity
profiles are substantially lower than PREM (see Fig. 1),
whereas its bulk sound velocity is virtually indistinguishable from
PREM (Extended Data Fig. 7a). We observe Ca-Pv's velocities, especially
vs, to be much lower than those predicted from thermodynamic

Fig. 4 | Modelled velocity profiles of lower-mantle phase assemblages
incorporating Ca-Pv based on this study. a, b, Models of MORB (a) and
harzburgite (b) phase assemblages relative to pyrolite throughout the lower
mantle. Shown are profiles of compressional velocities, vp (solid curves),
shear velocities, vs (long-dashed curves) and bulk sound velocities, vφ
(short-dashed curves); red curves are calculated velocity profiles along a
self-consistent 1,500 K adiabat, blue curves are calculated along a 1,000 K
adiabat. The coloured lines are the velocity when Ca-Pv is cubic, whereas
the lower bound of the coloured shading is indicative of the velocity
expected if Ca-Pv forms a tetragonal structure. The vertical dashed lines,
and corresponding grey shading, mark the depth of the bridgmanite (bm)
to post-perovskite (ppv) transition in each assemblage.


datasets, by previous high-temperature calculations, or high-PT
experiments. In all cases the discrepancy is almost entirely due to the
much lower shear modulus observed in this study. Indeed, it is evident
that the shear modulus of Ca-Pv is one of the most critical parameters
for accurately modelling the velocity of basaltic assemblages throughout
the lower mantle. A full comparison with results from previous studies
is provided in Methods. Although titanium-bearing Ca-Pv has not been
included in finite-strain modelling, our experimental data demonstrate
that titanium-incorporation will increase Ca-Pv's velocity by <1 km s⁻¹
for vp and <0.5 km s⁻¹ for vs for Ca[Si0.6Ti0.4]O3. As MORB-derived
Ca-Pv has an approximate composition of Ca[Si0.9Ti0.1]O3, the
effect of this titanium content on mantle velocities is expected to be
smaller.

Understanding the cause of the LLSVPs remains one of the most
prominent questions currently pursued by the deep Earth research
community. Identifying whether they are purely thermal anomalies
or are thermo-chemical piles of recycled or primordial material
could have profound consequences for our understanding of mantle
convection. In order to address the question of LLSVP composition,
we have incorporated our new cubic Ca-Pv EoS in thermodynamic
models (Methods, Fig. 4), extracting the acoustic velocities of recycled
basalt and harzburgite assemblages, relative to pyrolite, throughout
lower-mantle conditions (we are assuming pyrolite is representative of
average lower-mantle composition). Along a 1,500 K mantle adiabat,
recycled MORB is predicted to have distinctly lower compressional
and shear velocities than pyrolite. This result is in stark contrast with
published thermodynamic datasets, in which predicted MORB assemblages
have a velocity 1%-2% higher than the average deep mantle.
Thus, despite geochemical observations implying a long-lived reservoir
of recycled crust in the deep Earth and the high density of MORB
favouring its accumulation in the LLSVPs, this was previously considered
incompatible with observed low velocities unless temperatures
were extremely hot. However, our new results now imply LLSVPs
are well explained by modest enrichments in recycled oceanic crust,
without requiring excess temperature anomalies. Compared to the bulk
mantle composition, which is assumed to be a mixture of approximately
80:20 harzburgite:MORB, the vs anomalies of −1.5% and the
vs/vp anomaly ratio >2 observed in LLSVPs can be reproduced by
a bimodal mixture of MORB + harzburgite consisting of 64% MORB
at about 100 GPa (vp = −0.77%) or 48% MORB at about 125 GPa
(vp = −0.36%) if Ca-Pv is cubic. If the LLSVPs are hotter, or if Ca-Pv
is tetragonal near the CMB, the proportion of basalt required to explain
the LLSVPs would reduce. Our modelling further implies that the
above-average velocities that surround the LLSVPs, lying beneath
palaeo-subduction zones that are often considered to be slab graveyards,
could potentially represent depleted assemblages. MORB-enriched
LLSVPs, surrounded by depleted material, also provide an explanation
for the anti- or non-correlation of vs and vφ (bulk sound velocity) just
above the CMB if post-perovskite is stable. Taken at face value, the
predicted properties of CaSiO3 suggest that pyrolite assemblages may be
slightly slower than PREM. However, although we are confident that
our work robustly demonstrates that subducted MORB assemblages
are slow, the amount of Ca-Pv in pyrolite is smaller and so further
investigations on the effects of titanium and/or aluminium in Ca-Pv
are required to determine whether or not the average velocity of the
lower mantle remains compatible with a pyrolitic bulk composition.

The properties of Ca-Pv also provide an explanation for observed
seismic reflectors throughout the mid-mantle. In very cold slabs
following a 1,000 K adiabat, subducted basalts (if Ca-Pv is cubic)
are predicted to have very similar velocities to pyrolitic assemblages
on a 1,500 K adiabat and so may be seismically invisible. However,
if stranded fragments are thermally equilibrated with surrounding
depleted materials, impedance contrasts with magnitudes of up to
about +2.8% will be created, making them seismically visible as
reflectors or as slow regions. Alternatively, if Ca-Pv undergoes the
cubic-tetragonal phase transition, this may also generate mid-mantle
anomalies (Extended Data Fig. 5). Although constraining the depth
and compositional dependence of this phase transition requires
further studies, it is expected that cold downwelling Ca-Pv is likely
to experience the cubic to tetragonal transformation somewhere
beyond 1,000 km depth. Stagnant or delaminated materials in the
upper-lower mantle boundary region may undergo the tetragonal-cubic
transition during thermal equilibration, reducing MORB's
shear velocity, which may be the origin of observed mid-mantle
reflectors.

METHODS


Starting materials. Starting materials of CaSiO3 and Ca[Si0.6Ti0.4]O3 compositions
were initially prepared by grinding appropriate quantities of high purity CaCO3,
SiO2 and TiO2 together in an agate mortar before decarbonation and, in the case
of CaSiO3, sintering into crystalline wollastonite, or in the case of Ca[Si0.6Ti0.4]O3,
fusing to a glass. All materials were analysed with scanning electron microscopy
with energy dispersive X-ray spectroscopy (SEM EDS) to check composition, and
the CaSiO3 was confirmed as pure wollastonite using X-ray powder diffraction
(λ = 1.7904 Å). Ultrasonic experiments require fully dense cylindrical samples
with parallel polished faces and an aspect ratio (length/diameter) <2. Thus, the
Ca[Si0.6Ti0.4]O3 glass was hot-pressed into Ca-Pv at ~14 GPa and 1,400 °C using a
10/5 multi-anvil assembly, and the recovered products manipulated into suitable
forms for subsequent acoustic experiments. Since CaSiO3 perovskite is unrecoverable,
undergoing a decomposition to amorphous 'glass' with pervasive fracturing at
room temperature, it is not possible to prepare a similar sample of the endmember
perovskite composition. Instead, samples of walstromite, which is
the highest-pressure recoverable polymorph of CaSiO3, were prepared by sintering
at ~7 GPa and 1,300 °C using a 14/8 multi-anvil assembly.

Diffraction and ultrasonic experimental methods. X-ray diffraction and pulse- 
echo ultrasonic experiments were performed on beamline ID06-LVP of the ESRF 
synchrotron, where modified 10/5 multi anvil assemblies were employed to allow 
simultaneous measurement of sample diffraction, length and acoustic velocity. 
Samples were placed adjacent to a polished polycrystalline alumina buffer rod on 
one side that reached the cube to which the ultrasonic transducer was attached, and 
an MgO + NaCl mixture serving as soft backing and (for the NaCl) as a pressure 
marker on the other. Thin Au foils were placed at each end of the sample to allow 
sample length to be observed with radiographic imaging, and also between the 
cube and buffer rod to assist with acoustic coupling. This acoustic column was 
encapsulated in a crushable MgO sleeve, and a Re or TiB2 + BN furnace, which was
contained within ZrO2 insulation and a Cr:MgO octahedral pressure medium. A
W-Re thermocouple was used, inserted from the opposite end of the cell assembly 
in an MgO sleeve to monitor temperature adjacent to the sample throughout the 
experiments (Extended Data Fig. 8c). 

Along the path of the X-ray beam, the normal ceramic materials were replaced 
by high transparency amorphous SiBCN(O) windows. Monochromatic synchrotron
X-rays (λ = 0.22542 Å or 0.2296 Å) were used to collect diffraction patterns
from the sample and pressure markers throughout experiments using two different 
detectors available at ID06. In all experiments the standard Detection Technology 
X-scan 1D detector, which has a fixed 10-Hz integration rate, was used to record 
diffraction patterns every 3.2 s (continuous, with 32x rebin) and every 32 s (on 
increasing and decreasing pressure). Throughout one experiment on CaSiO3 an 
additional Pixirad-8 detector was employed to assist with the identification of 
weak superlattice reflections from the sample. X-ray sample-detector geometry 
was calibrated using Si and/or LaB6 NIST standards, and the collected diffraction 
patterns were suitably reduced and analysed using Fit2d*° software. Rietveld 
refinement of selected diffraction patterns was performed using the GSAS software 
package*® (Supplementary Tables 3 and 4). Experimental unit-cell volumes were 
also calculated and used to determine the pressure-volume-temperature (PVT) 
EoS by fitting the position and width of individual diffraction lines from the Ca-Pv 
samples and pressure markers. In this case Ca-Pv volumes are determined using 
the average volume calculated from the 200, 310, 321 and 222 diffraction peaks of 
the sample, while the volumes of NaCl and Au were determined using their 220 
and 310 peaks, respectively (Supplementary Table 5). 

Samples were compressed to target load and initially heated to T > 1,000 K 
to remove stress in the sample that might affect acoustic measurements. In the 
experiments on CaSiO3, the starting material of walstromite was converted into 
Ca-Pv by annealing at constant load (initial pressure ~14 GPa) and 1,200-1,500 K 
for a period of 2-4 h, until all signs of walstromite diffraction peaks were lost. 
Pressure throughout the experiments was determined from the unit-cell volumes 
of NaCl-B1 and/or Au using cross-calibrated high-temperature EoSs*”, and the 
sample length was measured using the standard imaging system installed on ID06- 
LVP. Sample lengths determined with X-ray imaging were checked against those 
measured before and after (in the case of Ca[Si0.6Ti0.4]O3) experiments with a 
digital gauge (1 µm accuracy). Uncertainties in sample lengths are estimated from 
images as ±5 pixels, corresponding to ±5.4 µm (<2% overall sample length), 
and it is this uncertainty that produces the reported uncertainties in velocities 
(Supplementary Table 6 and Fig. 3). Alongside ultrasonic measurements accompa- 
nied by diffraction and imaging, the crystallographic evolution of the samples was 
specifically investigated using continuous diffraction collected during constant rate 
cooling ramps (25-50 K min⁻¹) for CaSiO3 and Ca[Si0.6Ti0.4]O3 samples without 
collecting ultrasonic data. 
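
As a rough illustration of how the ±5.4 µm length uncertainty quoted above propagates into a velocity uncertainty, the sketch below applies v = 2L/t to one echo pair and scales the length error accordingly. The sample length (0.5 mm) and two-way travel time (100 ns) are hypothetical values chosen for illustration, not measurements from this study, and the travel time is treated as exact.

```python
# Quantities quoted in the text.
PIXEL_UNCERTAINTY = 5            # +/- pixels
LENGTH_UNCERTAINTY_M = 5.4e-6    # +/- 5.4 micrometres


def pixel_size_m():
    """Image scale implied by '+/-5 pixels corresponds to +/-5.4 micrometres'."""
    return LENGTH_UNCERTAINTY_M / PIXEL_UNCERTAINTY


def velocity_with_uncertainty(length_m, two_way_time_s,
                              dlength_m=LENGTH_UNCERTAINTY_M):
    """Velocity v = 2L/t and its uncertainty when only L is uncertain.

    With t treated as exact, the fractional errors are equal: dv/v = dL/L.
    """
    velocity = 2.0 * length_m / two_way_time_s
    dvelocity = velocity * dlength_m / length_m
    return velocity, dvelocity


# Hypothetical example values (assumptions, not data from the study).
v, dv = velocity_with_uncertainty(length_m=0.5e-3, two_way_time_s=100e-9)
print(f"v = {v:.0f} +/- {dv:.0f} m/s ({100 * dv / v:.2f} %)")
```

Because the fractional error equals dL/L, shorter samples carry proportionally larger velocity uncertainties for the same imaging resolution.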

Acoustic signals, always collected after sample annealing at high T, were transmitted 
into and received from the sample assembly using a 10° Y-cut dual-mode 
LiNbO3 piezoelectric transducer that was fixed to the corner of the ‘acoustic cube’ 
opposite to the sample using Epo-tek 353ND epoxy. A signal generator (Tektronix 
AFG3101C or Keysight 33622A) was used to create acoustic pulses composed of 
three consecutive periods of sine waves with 30-60 MHz frequency and 2.5 or 5 V 
peak-to-peak amplitude, which were passed to the transducer and oscilloscope. 
The resonant frequencies of vp and vs from the transducer crystal were ~50 MHz 
and ~30 MHz, respectively. Received echoes were measured using the same oscil- 
loscope (Tektronix DPO5140 or Keysight DSOS1044A) to record the delay between 
arrival times of compressional- and shear-wave signals at a rate of either 2.5 × 10⁹ or 
5 × 10⁹ samples per s. In later experiments, a directional bridge (Keysight 86205A), 
preamplifier (LA020-OS) on the return signal, and external trigger (trigger rate 
2 kHz) were variously used, as the system was continuously developed throughout 
this study. Collection times of individual acoustic spectra ranged from 5 s to 300 s 
with the various systems employed throughout this study. Two-way travel times 
of ultrasonic arrivals were converted into sample velocities using the ‘pulse-echo 
overlap method’ (for reflections from the near and far ends of the sample), which 
was implemented by maximizing absolute values of signal cross-correlation and 
sample lengths measured with X-ray imaging. Predicted reflection coefficients for 
both interfaces (based on R = (Z2 − Z1)/(Z2 + Z1), with Zi = ρi vi) are both negative 
(approximately −0.025 and −0.25 for Al2O3–Ca-Pv and Ca-Pv–NaCl+MgO 
respectively) at the PT conditions of the experiments, suggesting that no phase shift 
is expected in the acoustic signals. The lack of observed phase shifts and measured 
pulse/echo amplitude ratios were found to be consistent with these expectations, 
which provides assurance that the phase of acoustic arrivals has been correctly 
identified. Measured velocities are reported in Supplementary Table 6. It should be 
noted that two independent experiments were performed to measure the velocity of 
CaSiO3 samples (which agree within uncertainty for vp) during separate visits to the 
ESRF, one employing a Re and the other a TiB2:BN furnace. In the second of these 
experiments on CaSiO3, the shear-wave signal from the sample was not observable 
above noise levels and thus the shear-wave velocity was not determined. The two 
experiments are labelled “runa” and “runb” in Fig. 3. Additionally, the ultrasonic data reported for 
Ca[Si0.6Ti0.4]O3 above 975 K were collected on heating after annealing, while data 
below 975 K were subsequently collected upon cooling the sample. This results in 
a small pressure difference between the two sets of measurements and explains 
the discontinuity in Fig. 3. 
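
The travel-time step described above can be sketched as follows. This is a simplified illustration of picking echo delays by maximizing the absolute cross-correlation, not the authors' implementation. The trace is synthetic: a three-cycle 50 MHz burst sampled at 2.5 × 10⁹ samples per s, with echo amplitudes loosely mirroring the two negative reflection coefficients quoted in the text. The echo positions, the search windows and the 0.5 mm sample length are all hypothetical.

```python
import math


def sine_burst(freq_hz, n_cycles, dt_s):
    """Return n_cycles of a sine wave sampled every dt_s seconds."""
    n_samples = round(n_cycles / (freq_hz * dt_s))
    return [math.sin(2.0 * math.pi * freq_hz * i * dt_s) for i in range(n_samples)]


def best_lag(trace, template, start, stop):
    """Lag in [start, stop) that maximizes |cross-correlation| with the template."""
    def abs_xcorr(lag):
        return abs(sum(trace[lag + i] * t for i, t in enumerate(template)))
    return max(range(start, stop), key=abs_xcorr)


def two_way_time(trace, template, dt_s, split):
    """Delay between the echoes found before and after sample index `split`."""
    near = best_lag(trace, template, 0, split)
    far = best_lag(trace, template, split, len(trace) - len(template) + 1)
    return (far - near) * dt_s


# Synthetic trace (hypothetical): near-end echo at sample 100, far-end at 350.
DT = 1.0 / 2.5e9
pulse = sine_burst(50e6, 3, DT)
trace = [0.0] * 800
for start, amplitude in ((100, -0.025), (350, -0.25)):  # both echoes inverted
    for i, p in enumerate(pulse):
        trace[start + i] += amplitude * p

delay = two_way_time(trace, pulse, DT, split=200)
velocity = 2.0 * 0.5e-3 / delay  # assumes a hypothetical 0.5 mm sample length
print(f"two-way time = {delay * 1e9:.1f} ns, velocity = {velocity:.0f} m/s")
```

Taking the absolute value of the correlation makes the pick insensitive to echo polarity, which is why the expected sign of each reflection has to be checked separately, as the text does with the predicted reflection coefficients.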

Structure determination. Refined diffraction data from experiments on CaSiO3 
are consistent with a cubic crystal structure for Ca-Pv at high temperatures. All 
observed diffraction peaks have approximately constant widths at half maximum 
intensity, in line with expectations from the diffractometer geometry, and all 
observed diffraction peaks could be attributed to diffraction from the sample (in 
space group Pm-3m) or other cell components (MgO, Au, NaCl, TiB2, Al2O3). Upon 
cooling, the diffraction peaks from the sample undergo substantial nonlinear 
hkl-dependent broadening below ~420 K (Fig. 2 and Extended Data Fig. 1e). 
Similarly to previous studies'®3, we interpret this observation as the result of a 
cubic-tetragonal transition in CaSiO3. Close inspection of diffraction data from 
the Pixirad-8 detector, which was employed during one experiment, reveals that 
weak superlattice peaks from CaSiO3 (otherwise unexplained by other cell com- 
ponents) appear in data collected at 373 K and 300 K (Extended Data Fig. 2). The 
strongest of these is observed at 2θ ≈ 6.1° (indexed as 3/2 1/2 1/2 on the cubic 
sublattice) but additional peaks can also be observed at 2θ values of 8.05° (3/2 3/2 
1/2), 12.1° (5/2 3/2 3/2) and 13.2° (5/2 5/2 1/2) (d-spacings of about 2.11, 1.61, 
1.07 and 0.98 Å, respectively). Assuming that CaSiO3's initial distortion upon cooling 
is to a tetragonal phase, there are three likely candidate structures, with space 
groups I4/mcm, P4/mbm and I4/mmm. The predicted superlattice peak positions 
for the structure with I4/mcm symmetry exactly match the observed superlattice 
peak positions, explaining all four observed peaks, while CaSiO3 structures with 
space groups P4/mbm or I4/mmm cannot account for the observations (Extended 
Data Fig. 2) as these should both produce many additional superlattice peaks (for 
example, those indexed as 1/2 0 3/2, 3/2 1/2 1, 1/2 0 5/2, 5/2 1 1/2 and so on, on the 
basis of the cubic subcell) that are not observed. Given that there are no additional 
unexplained diffraction peaks, the knowledge that the most obvious superlattice 
peak at 2θ = 6.1° should indeed be the strongest in I4/mcm and that the same 
transition (Pm-3m to I4/mcm) occurs”! in CaTiO3, we see no reason to doubt that 
the structure of CaSiO3 at room temperature and at about 12 GPa is tetragonal with 
space group I4/mcm. 
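
The quoted d-spacings can be checked against the quoted peak positions with Bragg's law. The sketch below is an illustration only: it assumes the shorter of the two wavelengths given earlier (0.22542 Å) applies to the Pixirad-8 data, which the text does not state explicitly, and it uses the half-integer indices on the cubic sublattice to estimate the implied pseudo-cubic cell edge.

```python
import math

WAVELENGTH_A = 0.22542  # angstrom; assumed to be the wavelength used for these data

# (2-theta in degrees, indices on the cubic sublattice), as listed in the text.
SUPERLATTICE_PEAKS = [
    (6.1, (1.5, 0.5, 0.5)),
    (8.05, (1.5, 1.5, 0.5)),
    (12.1, (2.5, 1.5, 1.5)),
    (13.2, (2.5, 2.5, 0.5)),
]


def d_spacing(two_theta_deg, wavelength=WAVELENGTH_A):
    """Bragg's law, lambda = 2 d sin(theta), solved for d."""
    theta = math.radians(two_theta_deg / 2.0)
    return wavelength / (2.0 * math.sin(theta))


def cubic_cell_edge(d, hkl):
    """Cubic cell edge implied by one reflection: a = d * sqrt(h^2 + k^2 + l^2)."""
    return d * math.sqrt(sum(index ** 2 for index in hkl))


for two_theta, hkl in SUPERLATTICE_PEAKS:
    d = d_spacing(two_theta)
    a = cubic_cell_edge(d, hkl)
    print(f"2theta = {two_theta:5.2f} deg   d = {d:.2f} A   a = {a:.2f} A")
```

Under this wavelength assumption the four peaks give mutually consistent pseudo-cubic cell edges, which is the sense in which a single set of half-integer indices accounts for all of them.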

The observed behaviour of Ca[Si0.6Ti0.4]O3 is similar to that of CaSiO3, but the 
intensities of the superlattice peaks are much greater and the material is recoverable 
to ambient conditions. Diffraction patterns collected from the starting material 
Ca[Si0.6Ti0.4]O3 (measured alongside LaB6 as a calibrant at 300 K, Extended Data 
Fig. 3a) allow identification of superlattice peaks with odd and even Miller indices, 
which requires positive and negative octahedral tilts**. Thus, the highest possible 
symmetry of Ca[Si0.6Ti0.4]O3 at ambient conditions is orthorhombic. Refining the 
diffraction data in three likely space groups (Pbnm, Cmcm and P42/nmc) demonstrates 
that both Pbnm and Cmcm can explain the patterns equally well, and so it 
is concluded that ambient Ca[Si0.6Ti0.4]O3 is orthorhombic (actually monoclinic 
when B-cation ordered, see below). We choose to assume the space group is Pbnm, 
as this is the known”! structure of CaTiO3. Since no phase transition is observed 
during decompression, the room-temperature structure at high pressure is also 
assumed to be Pbnm, or its B-cation ordered equivalent (P21/c). At high-PT conditions, 
Ca[Si0.6Ti0.4]O3 is observed to undergo two structural phase transitions 
(Extended Data Fig. 4). At the highest temperatures, the diffraction pattern is well 
explained if the sample takes a cubic Fm-3m structure. With respect to the cubic 
form of CaSiO3, this has a double unit-cell edge length and partial B-cation ordering 
(assumed to follow a 1:1 B-site scheme’) of Si and Ti, a common ordering 
scheme in perovskites. On cooling, the Ca[Si0.6Ti0.4]O3 first distorts to a tetragonal 
structure marked by the appearance of the same set of superlattice peaks that were 
observed for CaSiO3. Since the cubic Ti-bearing Ca-Pv is B-cation ordered, it is 
assumed the tetragonal phase maintains this B-cation ordering, making the space 
group of the tetragonal phase I4/m. Continued cooling sees distortion into the 
room-temperature phase, which is assigned to be monoclinic P21/c (the B-cation 
ordered variant of Pbnm). Between the tetragonal and monoclinic structures there 
is a temperature interval where the diffraction pattern cannot be indexed using a 
single structure model, and it is observed that this temperature interval corresponds 
to very low acoustic velocities. This may be indicative of a first-order phase 
transition, which further implies a preference for Pbnm or P21/c over Cmcm or 
C2/c, based on the analysis of Glazer**. Alternative explanations for this behaviour 
are a temperature interval of phase coexistence, an additional perovskite structure 
or some other unexplained phenomenon. Similar observations have been made for 
CaTiO3 (ref. *°), where there is a small temperature interval between the I4/mcm 
and Pbnm structures where the behaviour is attributed to an interval of phase 
coexistence caused by the kinetic energy barrier of the first-order phase transition. 
A more detailed discussion of the crystallographic behaviour of Ca-Pv samples lies 
beyond the scope of the current study. 

Ab initio calculations and the slope of the Ca-Pv tetragonal-cubic transition. 
Ab initio calculations were performed to constrain the conditions of the tetragonal-cubic 
transition of Ca-Pv throughout the Earth’s mantle. All simulations were carried 
out with the density-functional-theory (DFT) code VASP"! using the projector-augmented-wave 
(PAW) method” and the PBE formulation of the generalized 
gradient approximation*’. Molecular dynamics (MD) calculations used the Nosé 
thermostat and were run at the gamma point with a cut-off of 600 eV and relaxed 
to within 10⁻⁴ eV and all forces to below 0.03 eV Å⁻¹. All computational runs were 
at least 20 ps in duration, although all measured properties were observed to be 
fully converged by 12 ps. Phonon calculations for calculating the force-constant 
matrix used an energy cut-off of 850 eV and 4 × 4 × 4 k-points, and were relaxed to 
10⁻⁸ eV with forces below 0.01 eV Å⁻¹. The finite difference method was used 
and processing was done with the phonopy code“. Ca atom semicore 3s and 3p 
states were treated as valence states. All static and molecular dynamics runs were 
spin-polarized and CaSiO3 was always simulated with an 80-atom simulation box, 
based on a 2 × 2 × 4 assemblage of the perovskite cubic aristotype cell containing 
5 atoms. For cubic perovskite calculations, the subcells from which the simulation 
box was constructed were fixed to have a geometry of a = b = c, whereas for tetragonal 
CaSiO3 the simulation geometry was fixed such that a = b ≠ c. Details of 
simulation cell volumes are provided in Supplementary Table 7. In each case the 
final stress on the crystal was correct to within 0.04 GPa for static calculations and 
to within 0.1 GPa (on average) for MD calculations. After the geometries were 
imposed at each PT condition the atoms were allowed to relax (within constraints 
maintaining cubic or tetragonal structure), before the simulations used to obtain 
free energies were run. 

Free energy differences were calculated at 25, 75 and 125 GPa and at 0, 1,000, 
2,000 and 3,000 K. To calculate the free energy of each state, we used an approxi- 
mation of the thermodynamic integration represented by equation (1): 


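$$F - F_0 \approx \langle U - U_0 \rangle_0 + \frac{1}{2 k_{\mathrm{B}} T} \left\langle \left[ U - U_0 - \langle U - U_0 \rangle_0 \right]^2 \right\rangle_0 \qquad (1)$$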


where subscript 0 represents the reference state and other terms the state of inter- 
est. For a reference state we used a harmonic oscillator with free energy defined 
by equation (2): 


2. 
P, 1 
U,=U+ ES so u.®..u: 2 
0 ims 2o aa al ( ) 


where u_i is a displacement vector and Φ_ij is the force constant matrix. As cubic 
CaSiO3 is unstable at low temperatures it has negative frequencies in its phonon 
spectrum, and the free energy cannot be calculated directly from its force constant 
matrix. Thus, we applied a small correction to eliminate these imaginary frequencies 
and allow free energy calculations. The correction matrix (Φ^c_ij) is formed by 
multiplying the onsite terms in the force constant matrix by kI (where I is the 
identity matrix and k is the smallest constant that eliminates all imaginary 
frequencies). The correction procedure then subtracts the correction matrix from 
the true force constant matrix (Φ_ij − Φ^c_ij) to produce a modified force constant 
matrix Φ′_ij, which is used in equation (2). While this procedure cannot predict the 
correct absolute free energies, it should accurately calculate the free energy dif- 
ference relative to a reference state that has had the same correction procedure 
applied. This procedure is repeated for both cubic (cub) and tetragonal (tet) 
CaSiO3 structures at each PT condition, and finally the free energy difference 
between these states is determined using equation (3): 


AGoup tet = AG ina ret + Gar — (AG ina rer + Goat) (3) 


The final calculated free energy differences between cubic and tetragonal states 
are reported in Supplementary Table 7. All free energy differences are subject to 
uncertainties, which are assumed to be 1 meV per formula unit (f.u.) at 0 K, and 
calculated to be <10 meV per f.u. at high temperatures, from the statistical uncer- 
tainty in the simulation energies. These uncertainties were incorporated in the 
weighted least-squares regression used to determine the conditions and uncertainty 
of the tetragonal-cubic transformation at each pressure (25, 75 and 125 GPa), as 
plotted in Extended Data Fig. 5. 
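
One way to picture the regression step is a weighted straight-line fit of the free energy difference against temperature, with the transition taken where the fitted line crosses zero. The sketch below makes that assumption explicitly; the text does not state the functional form used. The ΔG values are hypothetical placeholders, and only the temperatures and the approximate size of the uncertainties follow the description above.

```python
import math


def weighted_linear_fit(x, y, sigma):
    """Weighted least-squares line y = a + b*x with weights 1/sigma^2.

    Returns (a, b, var_a, var_b, cov_ab).
    """
    w = [1.0 / s ** 2 for s in sigma]
    s0 = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = s0 * sxx - sx ** 2
    a = (sxx * sy - sx * sxy) / delta
    b = (s0 * sxy - sx * sy) / delta
    return a, b, sxx / delta, s0 / delta, -sx / delta


def transition_temperature(temps, delta_g, sigma):
    """Zero crossing T0 = -a/b of the fitted line, with first-order error propagation."""
    a, b, var_a, var_b, cov_ab = weighted_linear_fit(temps, delta_g, sigma)
    t0 = -a / b
    dt_da = -1.0 / b
    dt_db = a / b ** 2
    var_t0 = dt_da ** 2 * var_a + dt_db ** 2 * var_b + 2.0 * dt_da * dt_db * cov_ab
    return t0, math.sqrt(var_t0)


# Hypothetical free energy differences (meV per f.u.) at one pressure.
temps_k = [0.0, 1000.0, 2000.0, 3000.0]
delta_g_mev = [20.0, -20.0, -60.0, -100.0]   # placeholder values, not study data
sigma_mev = [1.0, 10.0, 10.0, 10.0]          # 1 meV at 0 K, 10 meV at high T

t0, t0_err = transition_temperature(temps_k, delta_g_mev, sigma_mev)
print(f"transition at {t0:.0f} +/- {t0_err:.0f} K")
```

Weighting by 1/σ² means the tightly constrained 0 K point anchors the intercept, while the high-temperature points, with their larger statistical scatter, mainly control the slope and hence the uncertainty in the crossing temperature.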

Calcium perovskite EoS. PVT-velocity data from experiments in this study and from 
the literature were combined to allow fitting of an EoS for both tetragonal and cubic 
CaSiO3 perovskite. The pressures of all reported literature data were converted to 
a single pressure scale, which was used for pressure measurement in this study*”. 
This ensures consistency of the pressure scale used throughout the dataset. The 
fitted EoS for tetragonal Ca-Pv uses only room-temperature data and all literature 
data collected on samples compressed in diamond anvil cells without use of a 
pressure-transmitting medium were discarded (small symbols in Extended Data 
Fig. 6a). The remaining PV data’, including one-twentieth of the data collected 
during room-temperature decompression in this study before amorphization, 
were fitted to a second-order Birch-Murnaghan EoS for V0 (volume at ambient 
conditions) and K0 (isothermal bulk modulus at ambient conditions) using the 
BurnMan software package!! (Supplementary Table 2). Only one-twentieth of the 
current data were used, to ensure that the final fitted model was not overly biased 
to the data collected in this study. Subsequently, room-temperature velocity measurements 
from this study were combined with literature data!””° to fit the shear 
modulus (G0) and its pressure derivative (G0′) within the SLB2005”? formalism, as 
provided in BurnMan. As the EoS is only calibrated at room temperature, estimates 
of velocity reductions for tetragonal Ca-Pv in Fig. 4 are qualitatively based on the 
magnitude of reductions observed in measurements from Ca[Si0.6Ti0.4]O3, taken 
~100 K below the phase transition, in this study, and are not calculated by using 
the tetragonal Ca-Pv EoS. 
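
For readers unfamiliar with the functional form, a second-order Birch-Murnaghan EoS relates pressure to volume through only V0 and K0. The sketch below writes that standard expression out directly rather than calling BurnMan, and inverts it by bisection. The V0 and K0 values are hypothetical placeholders; the fitted values are those in Supplementary Table 2.

```python
def bm2_pressure(volume, v0, k0):
    """Second-order Birch-Murnaghan: P = (3 K0 / 2) [(V0/V)^(7/3) - (V0/V)^(5/3)]."""
    x = v0 / volume
    return 1.5 * k0 * (x ** (7.0 / 3.0) - x ** (5.0 / 3.0))


def bm2_volume(pressure, v0, k0, tol=1e-10):
    """Invert P(V) by bisection; valid for compression (pressure >= 0)."""
    lo, hi = 0.3 * v0, v0          # pressure rises monotonically as V decreases
    while hi - lo > tol * v0:
        mid = 0.5 * (lo + hi)
        if bm2_pressure(mid, v0, k0) > pressure:
            lo = mid               # too compressed, move towards larger volume
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical parameters for illustration only (not the fitted values).
V0_A3 = 45.0     # volume at ambient conditions, cubic angstrom
K0_GPA = 240.0   # isothermal bulk modulus at ambient conditions, GPa

p = bm2_pressure(0.95 * V0_A3, V0_A3, K0_GPA)
print(f"P at V/V0 = 0.95: {p:.2f} GPa")
print(f"V/V0 at 12 GPa: {bm2_volume(12.0, V0_A3, K0_GPA) / V0_A3:.4f}")
```

Truncating at second order is equivalent to fixing the pressure derivative of the bulk modulus at 4, which is why only two parameters are fitted for the tetragonal phase.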

The EoS for cubic Ca-Pv (Supplementary Table 2) also fits data from this 
study and the literature’ “* using the SLB2005”° Mie-Grüneisen-Debye Birch-Murnaghan 
formalism implemented in BurnMan!'. Only data falling above the 
calculated P-T curve of the tetragonal-cubic phase transition (Extended Data 
Fig. 5) were used, to ensure that no data from tetragonal-structured Ca-Pv were 
included (Extended Data Fig. 6b). PVT data were first used to fit V0, K0, K0′ (pressure 
derivative of the bulk modulus) and γ0 (Grüneisen parameter). Subsequently, 
the complete PVT-velocity dataset was re-fitted for G0, γ0, q0 (temperature dependence 
of the bulk moduli) and θ0 (Debye temperature) (V0, K0, K0′ were unchanged) 
assuming G0′ = 1.66. G0′ was fixed to literature values*®”° due to the small pressure 
range of velocity measurements in this study, a value consistent with literature 
scaling rules”’. ηS0 (temperature dependence of the shear moduli) was fixed at 3.3, 
based on the scaling rules (7.0/0 © 2) from Stixrude”’. Alternatively, ηS0 can also 
be a fitted parameter (Supplementary Table 2). However, since this second fit (with 
variable ηS0) results in slightly lower extrapolated velocities without drastically 
altering subsequent interpretation, fixing ηS0 was viewed as a more conservative 
way to evaluate the influence of Ca-Pv. It is noted that the uncertainty bounds 
plotted in Fig. 1 account for variation of G0′ from 1.44 to 1.88 and Ca-Pv remains 
slower than PREM and ab initio estimates throughout this entire range. We recognize 
that, without using literature data, extrapolation throughout the mantle 
pressure range would be completely unrealistic, and that we are beholden to accepting the 
reliability of literature data; however, we note that four previous studies on Ca-Pv at 
high-PT conditions, after conversion to a common pressure scale, can be combined 
and fitted to a single EoS without any outliers at the 3σ level. Readers are referred 
to Stixrude”® for details of the SLB2005 formalism. 

Thermodynamic modelling. The acoustic properties of MORB, peridotitic and 
harzburgitic assemblages have been calculated using the MMA-EoS software 
package”*. Simplified bulk compositions for MORB (NCFMAS) and pyrolite/ 
harzburgite (CFMAS) from the software's library were employed as typical of these 
assemblages throughout the lower mantle. Equilibrium phase assemblages were 
calculated across a 0.5 GPa by 25 K grid throughout the mantle for each system, and 
the elastic properties of each assemblage extracted along self-consistent adiabatic 
temperature profiles beginning at 1,000 K (representing slabs) and 1,500 K (average 
mantle), which are plotted relative to one another in Fig. 4. The latter temperature 
profile is very similar to the geotherm of Brown and Shankland*’. Thermoelastic 
data from Stixrude® were used for all phases except for the MgSiO3 bridgmanite 
endmember (which used updated properties from Zhang*!) and calcium perovskite 
which is defined in this study. We note that this database provides a somewhat 
simplified view of lower-mantle materials, as it does not include the effects of iron 
spin-transitions in ferropericlase or bridgmanite°*” or the ferroelastic phase tran- 
sitions of stishovite™*. We also highlight that the modelling in this study inherently 
assumes that the database from Stixrude et al.° accurately describes the elastic 
properties of all other lower-mantle phases. 

Comparison with previous studies. As noted in the main text, the acoustic velocities 
of CaSiO3 measured in this study are substantially slower than those 
predicted in computational studies*> and mineralogical databases®. They are 
also slower than those found in previous high-PT experiments”. 
Although we cannot fully explain the reasons for all disagreements, we 
discuss some observations that may partially explain the mismatches. Database 
elastic properties® predict the fastest Ca-Pv velocities plotted in Fig. 1, and it is 
these that the Earth science community currently uses when interpreting seismic 
observations. Results from the two ab initio molecular dynamics (AIMD) compu- 
tational studies*° and experiments” all predict that Ca-Pv should be equal to (vp) 
or slower (vs) than PREM. If any of these results were adopted in mineralogical 
databases, slow velocity anomalies in the lower mantle could be interpreted as they 
are in this study, an indicator of MORB enrichment, although not to the extent 
implied here. We note that the other pseudo-high-temperature calculations’ do 
not provide enough information in the paper to calculate acoustic properties at 
elevated temperatures, since the temperature effect on density is unquantified in 
the original publication. 

Comparing our work in detail, first with previous experimental results, it is 
observed that the room-temperature velocities measured in this study are in excel- 
lent agreement with those of Gréaux et al.”°, Kudo et al.!” and extrapolated esti- 
mates from Sinelnikov et al.!°. Room-temperature velocities measured by Li et al.” 
are somewhat faster, but given the lack of details provided in that paper, which is 
a technical review, they are not considered further. It is observed that our reported 
velocities disagree with previous experimental data only at high temperature”, 
appearing to diverge as the reported temperature increases. Given the similarities 
in methodology, it is most likely that temperature uncertainties are responsible for 
the differences. Based on published details, we believe Gréaux et al.?° employed 
samples of 0.93-1.3 mm length (Fig. 2 and Extended Data Fig. 3 of Gréaux et al.”°) 
and 2 mm diameter, with the thermocouple inserted radially (through the furnace) 
adjacent to the far end of the pressure marker that in their experiments is initially 
~1 mm in length. This arrangement is substantially larger than the samples used 
in this study, which were 0.4-0.6 mm in length and 1.5 mm diameter, with the 
thermocouple inserted axially to the end of the pressure marker that had a maxi- 
mum length of 0.5 mm. Thus, the maximum distance between the thermocouple 
and far end of the sample in our study is 1.1 mm, probably 0.75 mm at high pres- 
sure, approximately half of the equivalent distance in Gréaux et al.?° (probably 
>1.5 mm at pressure). Additionally, by inserting the thermocouple axially we 
ensure the sample is centred in the cell at high pressure, whereas it is unclear 
whether or not this would be the case with a radial thermocouple, which could 
also have been affected by contacting the metal furnace. Given that, at 12-22 GPa, 
the sample column is likely to be 4-5 mm in length, the differences in geometry 
might have a very large influence on the temperature conditions experienced by 
samples. Thermal modelling using finite element code” suggests that the thermal 
gradient across samples at a measured temperature of 1,200°C in our set-up should 
be <50-60°C, with the measured temperature likely to be lower than peak condi- 
tions. By contrast, assuming the thermocouple is centred (as drawn by Gréaux 
et al.”°) and measuring 1,200°C, the range of temperatures experienced by a sam- 
ple of 1 mm length x 2 mm diameter could very conceivably be 250-300°C lower 
than that measured by the thermocouple. This implies that the apparent effect of 
temperature on velocities should be smaller using the geometry of Gréaux et al.”°, 
since portions of samples would be colder than believed. This is consistent with 
the observed differences in velocities between Gréaux et al.”° and the present study, 
where the offset in reported velocities increases at higher temperatures. Additional 
evidence that the high-temperature velocities reported by Gréaux et al.”° might be 
less reliable is demonstrated by comparing the independent estimates of bulk sound 
velocity (vΦ) expected for Ca-Pv from PT-volume systematics and acoustic measurements 
($v_\Phi = \sqrt{v_P^2 - \tfrac{4}{3} v_S^2} = \sqrt{K_S/\rho}$, where KS is a function of K0T, K′, γ and α). 
We observe that the bulk sound velocities extracted from ultrasonic measurements 
in Gréaux et al.”° are inconsistent with velocities extracted via a PVT EoS using their 
diffraction data (Extended Data Fig. 9). Ultrasonic vΦ values from Gréaux et al.”° 
are offset to slower values and have a much larger reduction at high temperature 
than those predicted via an EoS fitted using density from their or compiled 
literature data**~“’. In contrast, bulk sound velocities from data in this study are 
consistent with literature PVT EoS fitting. This inconsistency suggests that velocities 
reported in Gréaux et al.”” might be affected by large temperature gradients. 
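
The consistency check described above rests on the standard identity between bulk sound velocity, the two acoustic velocities and the adiabatic bulk modulus. A minimal sketch follows; the velocity and density values are hypothetical placeholders rather than data from either study, and the units are chosen so that g cm⁻³ × (km s⁻¹)² gives GPa.

```python
import math


def bulk_sound_velocity(vp, vs):
    """v_phi = sqrt(vp^2 - (4/3) vs^2), in the same units as vp and vs."""
    return math.sqrt(vp ** 2 - (4.0 / 3.0) * vs ** 2)


def adiabatic_bulk_modulus(density_g_cm3, vp_km_s, vs_km_s):
    """K_S = rho * v_phi^2; with g/cm^3 and km/s the result is in GPa."""
    return density_g_cm3 * bulk_sound_velocity(vp_km_s, vs_km_s) ** 2


def bulk_sound_from_modulus(ks_gpa, density_g_cm3):
    """v_phi = sqrt(K_S / rho), the route available from a PVT EoS alone."""
    return math.sqrt(ks_gpa / density_g_cm3)


# Hypothetical values for illustration only.
vp, vs, rho = 10.0, 5.5, 4.5   # km/s, km/s, g/cm^3

v_phi = bulk_sound_velocity(vp, vs)
ks = adiabatic_bulk_modulus(rho, vp, vs)
print(f"v_phi = {v_phi:.3f} km/s, K_S = {ks:.1f} GPa")
```

Because the ultrasonic route uses only vp and vs while the EoS route uses only KS and ρ, the two estimates are independent, and a mismatch between them points to a problem in one of the underlying datasets.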

Considering calculated properties of Ca-Pv, we observe that the database values* 
best reproduce the adiabatic bulk moduli of Ca-Pv (compared with that from the 
global experimental dataset, Extended Data Fig. 7b), while the two AIMD studies*? 
predict a larger pressure effect on Ks than is observed in the experimental data. 
Other calculations employing mean-field and Landau theory’ suggest that shear 
softening should be associated with the cubic-tetragonal transition, while AIMD 
approaches do not include this behaviour*>. However, it has been proposed‘ that 
the choice of cubic unit cell employed by Stixrude et al. prevented rotations of the 
SiO6 octahedra, leading to an anomalously large shear modulus and explaining the 
high velocities. The AIMD results of Kawai and Tsuchiya‘ should be preferred to 
those from Li et al.5, as the latter may not have fully converged and insufficiently 
sampled the Brillouin zone’ to accurately predict crystal structure. Despite differ- 
ences, all three computational studies predict a larger shear modulus than required 
by experimental data (from both this and previous studies!®!7). It is possible 
that this discrepancy results from the strong anharmonicity of Ca-Pv, implying 
that extremely expensive calculations may be required to accurately describe 
Ca-Pv’s elasticity using computational methods. Indeed, common first-principles 
methods inaccurately predict the elasticity or phonon temperature dependence 
of other anharmonic cubic perovskites (SrTiO3, BaTiO3 and PbTiO3)°°>%. DFT 
calculations that under/overestimate the cubic lattice parameter consistently over/ 
underestimate the shear modulus in the opposite sense***. The local-density 
approximation, used by Kawai and Tsuchiya‘, has been observed to overestimate 
the shear modulus (c44) of SrTiO3, BaTiO3 and PbTiO3 by 8-18% for ~1% underestimate 
of unit cell volume. Since we observe a similar mismatch between the 
volume of cubic CaSiO3 at adiabatic conditions based on our fit to experimental 
data and the results of Kawai and Tsuchiya‘, which are ~1% too small, we expect 
that Ca-Pv’s velocities predicted by Kawai and Tsuchiya‘ will be somewhat over- 
estimated. However, it is unlikely this effect can explain the entirety of the disa- 
greement between previous calculations and our experimental results. A second 
contributor to the mismatch could be the presence of crystallographic preferred 
orientation (CPO) within experimental samples, especially if the alignment of 
an acoustically slow direction coincided with the ultrasonic path. However, since 
refinement of X-ray diffraction patterns did not require CPO in the cubic CaSiO3 
field to fit the data, this seems unlikely. Additionally, the way crystal symmetry is 
stipulated and the lack of grain boundaries/defects in calculations may frustrate 
some phonon modes, further explaining the offset from experimental values. 
Finally, we re-iterate that the finite-strain model we report in this paper is subject 
to very large extrapolation from the experimental PT conditions (~12 GPa, 
300-1,500 K) to those of the mantle (<130 GPa, 1,500-3,000 K) and we acknowl- 
edge additional experiments are now required to investigate in better detail the 
changes of Ca-Pv velocity at more extreme conditions. 

Extended Data Fig. 1 | Lattice and diffraction peak parameters for CaSiO3 and 
Ca[Si0.6Ti0.4]O3 perovskite. a-d, Refined lattice parameters and pseudo-cubic unit 
cell volumes from Ca[Si0.6Ti0.4]O3 (a, c) and CaSiO3 (b, d) plotted as a function of 
experimental temperature with 2σ uncertainties. e, Full-width at half-maximum 
(FWHM) of diffraction peaks (see key) of the CaSiO3 perovskite sample, normalized 
to the FWHM at high temperature, measured at 100 K intervals in a separate 
experiment to that in Fig. 2. 

Extended Data Fig. 2 | X-ray diffraction patterns from CaSiO3 perovskite. Shown 
are stacked diffraction patterns of CaSiO3 perovskite; each panel shows data at 
300 K, 373 K and 473 K (see key in a). a, Full patterns; b, c, patterns limited in the 
2θ range to allow indication of weak superlattice peaks. The positions of the 
diffraction peaks from the Ca-Pv sample, MgO, NaCl and Au are indicated by 
markers; other small peaks are from boron epoxy and/or furnace components. 
Cubic Ca-Pv peaks are labelled with indices, hkl, in bold. The diffraction patterns 
reveal the appearance of small superlattice reflections at T = 373 K and 300 K at 2θ 
values of about 6.1°, 8.05°, 12.1° and 13.2° (we note there is believed to be an 
additional superlattice reflection obscured at 2θ = 10.5°) labelled with hkl indexed 
on the tetragonal (I4/mcm) unit cell and marked with gold stars. 


LETTER 


—— model 

—— residual 
« observed 
1 capv(P2;/c) 
1 LaBe 


Intensity (arbitrary units) 


— model 
—— residual 


Intensity (arbitrary units) 


—— model 

—— residual 
observed 
capvFm3m 
MgO 

Au 

NaCl 

TiBz 


I 
| 
! 
' 
1 


Intensity (arbitrary units) 


6 7 8 9 10 11 12 13 
20 
Extended Data Fig. 3 | Refined X-ray diffraction patterns from pressure (12 GPa). In each panel, the black dots are the collected data, the 
Ca[Sio.¢Tio.4]O3 perovskite. a—c, Rietveld refinements of Ca[Sip ¢Tio.4]O3 blue curve the model pattern and the green curve the residual. The 
samples: a, in P2;/c with LaBg calibrant, at 300 K and ambient pressure; coloured tick-marks indicate the positions of diffraction peaks of each 
b, in the tetragonal [4/m structure (with other cell components) at 890 K phase. 


and high pressure (about 12 GPa); and ¢, in Fm3m at 1,336 K and high 
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Extended Data Fig. 4 | X-ray diffraction patterns from Ca[Sio¢Tio4]O3 620 and 444 diffraction peaks (b-h, respectively; indexed using a cubic 


perovskite. a, Complete diffraction pattern of the Ca[Sio¢Tig.4]O3 lattice with a ~ 7.3 A), demonstrating the change in thermal expansivity 
sample as a function of temperature at about 12 GPa, with diffraction between cubic and tetragonal/monoclinic structures, and allowing visual 
intensity indicted by colour scaling. b-h, Magnified panels froma identification of the observed phase transitions. 
the mantle from ab initio simulations and experiments. Shown is the
cubic-tetragonal transition extrapolated throughout the mantle based
on ab initio (solid circles) and experimental (triangles) constraints from
this study. Vertical error bars (1σ) and the grey envelope (80% confidence [...]
plotted as red curves, with dashed red arrows indicating the warming
occurring during slab stagnation at 700-1,000 km depth. Results from
previous experimental'*”? and computational’ studies are plotted as open
symbols and grey curves, respectively.
press, which can be subject to larger uncertainties in volume. Error bars [...]
compared with the best-fit model, demonstrating the lack of outliers***.
and schematic of the experimental cell design. a, Example of synchrotron
radiographic image in plan view used to measure sample length in a [...]
on a CaSiO3 sample. c, Cross-section of the experimental assembly (to
scale) used in ultrasonic experiments throughout this study.
Climate change and overfishing increase
neurotoxicant in marine predators
More than three billion people rely on seafood for nutrition.
However, fish are the predominant source of human exposure to 
methylmercury (MeHg), a potent neurotoxic substance. In the 
United States, 82% of population-wide exposure to MeHg is from 
the consumption of marine seafood and almost 40% is from fresh 
and canned tuna alone¹. Around 80% of the inorganic mercury (Hg)
that is emitted to the atmosphere from natural and human sources is
deposited in the ocean², where some is converted by microorganisms
to MeHg. In predatory fish, environmental MeHg concentrations 
are amplified by a million times or more. Human exposure to 
MeHg has been associated with long-term neurocognitive deficits 
in children that persist into adulthood, with global costs to society 
that exceed US$20 billion³. The first global treaty on reductions
in anthropogenic Hg emissions (the Minamata Convention on 
Mercury) entered into force in 2017. However, effects of ongoing 
changes in marine ecosystems on bioaccumulation of MeHg in 
marine predators that are frequently consumed by humans (for 
example, tuna, cod and swordfish) have not been considered when 
setting global policy targets. Here we use more than 30 years of 
data and ecosystem modelling to show that MeHg concentrations 
in Atlantic cod (Gadus morhua) increased by up to 23% between the 
1970s and 2000s as a result of dietary shifts initiated by overfishing. 
Our model also predicts an estimated 56% increase in tissue MeHg 
concentrations in Atlantic bluefin tuna (Thunnus thynnus) due to 
increases in seawater temperature between a low point in 1969 and 
recent peak levels—which is consistent with 2017 observations. 
This estimated increase in tissue MeHg exceeds the modelled 22% 
reduction that was achieved in the late 1990s and 2000s as a result 
of decreased seawater MeHg concentrations. The recently reported 
plateau in global anthropogenic Hg emissions⁴ suggests that ocean
warming and fisheries management programmes will be major 
drivers of future MeHg concentrations in marine predators. 

The exploitation of fisheries in the northwestern Atlantic Ocean for 
hundreds of years has led to large fluctuations in herring, lobster and 
cod stocks, which has altered the structure of food webs and the avail- 
ability of prey for remaining species. We synthesized more than three 
decades of ecosystem data and MeHg concentrations in seawater, sed- 
iment and biological species that represent five trophic levels from the 
Gulf of Maine, a marginal sea in the northwestern Atlantic Ocean that 
has been exploited for commercial fisheries for more than 200 years. 
These data were used to develop and evaluate a mechanistic model for 
MeHg bioaccumulation that is based on bioenergetics and predator- 
prey interactions (see Methods), to better understand the effects of 
ecosystem changes and overfishing®. 

A comparison of simulated MeHg concentrations based on extensive 
analysis of the stomach contents of two marine predators (Atlantic cod 
and spiny dogfish, Squalus acanthias) in the 1970s and 2000s reveals 
that the effects of shifts in trophic structures caused by overfishing 
differed between these two species (Fig. 1a, b). In the 1970s, cod
consumed 8% more small clupeids than in the 2000s as a consequence
of the overharvesting and reduced abundance of herring’. Simulated 
tissue MeHg concentrations in cod (larger than 10 kg) in the 1970s were 
6-20% lower than for cod consuming a diet typical of the 2000s that 
relied more heavily on larger herring, lobster and other macroinver- 
tebrates’. The 1970s diet for spiny dogfish when herring were limited 
included a higher proportion (around 20%) of squid and other ceph- 
alopods, which exhibit higher MeHg concentrations than many other 
prey fish. In contrast to cod, simulated MeHg concentrations in spiny 
dogfish were 33-61% higher in the 1970s than in the 2000s, when they 
consumed more herring and other clupeids’. These results illustrate 
that perturbations to the trophic structure of marine organisms from 
overfishing can have contrasting effects on MeHg concentrations across 
species. Such changes must therefore be assessed before concluding 
that temporal trends in biological MeHg concentrations reflect shifts
in environmental Hg contamination.

Fig. 1 | Modelled effects of ecosystem change on MeHg concentrations
in Atlantic cod and spiny dogfish. a, b, Differences in modelled
MeHg concentrations in Atlantic cod (a) and spiny dogfish (b) based
on a diet typical of the 1970s (dotted line) and the 2000s (solid line).
Prey preferences for each time period were derived from the stomach
contents of more than 2,000 fish’. c, d, Modelled changes in fish MeHg
concentrations (relative to a diet typical of the 2000s) that result from
a temperature increase of 1 °C; a shift in diet composition driven by
overfishing of herring (represented by 1970s prey preferences when this
last occurred); an assumed 20% decline in seawater MeHg concentration;
the combination of both an increase in temperature and a decrease in
seawater MeHg; and the simultaneous combination of all three factors.

Fig. 2 | Effects of seawater warming in the Gulf of Maine on tissue MeHg
concentrations in ABFT. a, Modelled seawater MeHg concentrations
over time. The model is based on measured MeHg concentrations
between 2008 and 2010!” and scaled by modelled temporal changes in
seawater Hg!*. b, Measured temperature anomaly in seawater in the Gulf
of Maine’. The shaded grey area indicates the projected future change.
c, Modelled MeHg tissue concentrations in 14-year-old ABFT based on
changes in seawater MeHg concentrations (dashed line), and based on the

Northward migration of the Gulf Stream and decadal oscillations in 
ocean circulation have led to unprecedented seawater warming in the 
Gulf of Maine between a low point in 1969 and 2015, which places this 
region in the top 1% of documented seawater temperature anomalies’. 
Both laboratory and field mesocosm data have demonstrated that rising 
temperatures lead to increases in MeHg concentrations in estuarine 
and freshwater fish®"°, but the magnitudes of potential changes in wild 
species are poorly understood. The effects of seawater warming are 
complicated by the narrow temperature niches of many marine fish 
species, which we account for in our food web model (see Methods). 
Seawater warming of greater than 1-2°C can lead to shifts in preferred 
foraging territory to higher latitudes or deeper in the water column, 
which alters the availability of prey for remaining species'?. 

The effects of ecosystem changes on MeHg bioaccumulation vary 
across species and are not additive for predatory fish because of feeding 
relationships and bioenergetics at lower trophic levels. We modelled 
the changes in MeHg tissue concentrations in Atlantic cod and spiny 
dogfish that would result from increases in seawater temperature, 
declines in seawater MeHg concentrations and shifts in trophic struc- 
ture due to overfishing (Fig. 1c, d). Experimental data indicate that 
MeHg uptake by most marine algae is not sensitive to variability in 
seawater temperature® and therefore our modelling analysis accounts 
for temperature-driven changes in MeHg at higher trophic levels, from 
zooplankton to predatory fish. 

For a 15-kg Atlantic cod, our model predicts that an increase of 1°C 
in seawater temperature relative to the year 2000 would lead to a 32% 
increase in simulated tissue MeHg concentrations. A shift in trophic 
structure characteristic of overexploited herring fisheries would result 
in a 12% decrease in fish MeHg. In the absence of ecosystem changes, 
simulated fish MeHg concentrations shift proportionally to seawater 
MeHg concentrations. If we assume that seawater MeHg concentra- 
tions decline by approximately 20% as a consequence of reductions 
in Hg loading, the combination of all three factors simultaneously 
results in a 10% decrease in tissue MeHg concentrations for Atlantic 
cod (Fig. 1c). 
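
To see why the single-factor effects cannot simply be summed, the three changes quoted above for a 15-kg Atlantic cod can be combined in two naive ways and set beside the reported result. This is illustrative arithmetic only: the function names are hypothetical, and neither combination rule is the food web model itself.

```python
# Single-factor changes for a 15-kg Atlantic cod, as fractions (values from the text).
temperature = 0.32     # +1 °C of seawater warming
diet_shift = -0.12     # trophic structure typical of overexploited herring fisheries
seawater_mehg = -0.20  # assumed decline in seawater MeHg (fish MeHg scales proportionally)


def additive(changes):
    """Naive sum of fractional changes."""
    return sum(changes)


def multiplicative(changes):
    """Compound the changes as if they were independent scaling factors."""
    total = 1.0
    for change in changes:
        total *= 1.0 + change
    return total - 1.0


factors = [temperature, diet_shift, seawater_mehg]
print(f"additive estimate:       {additive(factors) * 100:.1f}%")
print(f"multiplicative estimate: {multiplicative(factors) * 100:.1f}%")
print("reported model result:   -10% (all three factors, Fig. 1c)")
```

Neither naive rule reproduces the reported 10% decrease, which is consistent with the statement that the effects interact through feeding relationships and bioenergetics at lower trophic levels.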

Fig. 2 (continued) | combined effect of changes in seawater MeHg
concentrations and seawater temperature anomaly (solid line). The symbols
indicate means of observed concentrations in multiple fish: new data for
ABFT that were captured in 2017 (n = 33) are shown as a star; previously
published data!®!*° are shown as crosses!© (n = 83), a square!’ (n = 14),
a triangle’? (n = 3) and a circle”° (n = 5). Sample size (n) represents the
number of independent fish; s.d. and statistics are provided in Extended
Data Table 3. d, Changes in MeHg concentrations in ABFT that are due to
temperature only.
[Fig. 2c, d: panel c, ABFT (Thunnus thynnus), with curves for the change in
seawater MeHg and for the change in seawater MeHg and temperature;
panel d, temperature-driven change; x axis, year ABFT captured (1950-2030).]

For a 5-kg spiny dogfish, our model estimates that a temperature
increase of 1 °C would result in a 70% increase in tissue MeHg
concentrations, and that switching to a diet that is characteristic
of low herring abundance would lead to a 50% increase in fish MeHg.
When combined with the assumed 20% decline in seawater
MeHg concentrations, the model predicts a 70% increase in tissue
MeHg concentrations for dogfish (Fig. 1d). Owing to a large reduction
in Hg releases from wastewater and declines in atmospheric deposition
of Hg in North America’'3, seawater MeHg concentrations in
the northwestern Atlantic Ocean are presumed to have declined since
the 1970s (Fig. 2a). Our results help to explain why temporal changes
in tissue MeHg concentrations in the Gulf of Maine have been mixed
across species, despite declining inputs of Hg to the marine environment
since the 1970s’.

We used historical temperature records to further investigate the 
effects of recent temperature changes on MeHg bioaccumulation in 
Atlantic bluefin tuna (ABFT), another important marine predator 
(Fig. 2). No time-series data on seawater MeHg are available, so we 
extrapolated measured concentrations using information on emissions 
in North America and projected total Hg concentrations in seawater 
(see Methods). Increases in seawater temperature coincide with puta- 
tive declines in seawater MeHg concentrations (Fig. 2b). 

The implications of changes in seawater MeHg concentrations 
(Fig. 2a) and seawater temperature (Fig. 2b) in the Gulf of Maine for 
tissue MeHg concentrations in 14-year-old ABFT (250 ± 23 cm
length'* (mean ± s.d.)) are illustrated in Fig. 2. The dashed line in
Fig. 2c shows the changes in MeHg in ABFT tissue that result from 
changing seawater MeHg only, and the solid line shows the combined 
influence of changes in seawater MeHg and temperature. Without 
including the effects of temperature, shifts in MeHg concentrations 
in ABFT lag peak seawater MeHg concentrations by five years, and 
the amplitude of the peak is dampened relative to seawater (Fig. 2c, 
dashed line). Historical temperature oscillations result in an additional 
lag of six years in maximum MeHg concentrations in ABFT (Fig. 2c, 
solid line), and reduce the standard error of the modelled tissue MeHg 
concentrations in ABFT compared to observations (Fig. 2c, symbols) 
from 120 ng g⁻¹ (Fig. 2c, dashed line) to 95 ng g⁻¹ (Fig. 2c, solid line).

Both the model and observations indicate that a large decline in 
MeHg concentrations in ABFT occurred after the late 1980s and early 
1990s (Fig. 2c). The modelled decrease from peak to low concentrations 
is equivalent to a 23% decline in tissue MeHg concentrations (Fig. 2c). 
Observed concentrations in 14-year-old ABFT from the Gulf of Maine
show a 31% decrease between 1990 and 2012. Our model results sug- 
gest that 25-40% of tissue MeHg decreases in the 1990s are attributable 
to temperature decreases over this decade (Fig. 2d). 

Modelled effects of continued warming in the Gulf of Maine 
suggest a reversal of previous declines, and projected increases of 
almost 30% in 2015 that are sustained into 2030 (Fig. 2d). Between 
2012 and 2017, observations are consistent with model trends and show 
a statistically significant increase in MeHg (Fig. 2) of more than 3.5% 
per year in ABFT (one-way ANOVA, P < 0.05). These results illustrate 
the large effects on bioaccumulative toxicants in marine food webs that 
are expected as a result of climate-driven changes in marine ecosystems. 

Global anthropogenic emissions of Hg have been relatively stable 
since approximately 2011⁴. In North America and Europe, aggressive
Hg regulations that began in the 1970s have successfully reduced or 
phased out most large Hg sources, and global emissions are now driving 
atmospheric Hg trajectories in the Northern Hemisphere. This means 
that future changes in tissue concentrations of MeHg in pelagic marine 
predators such as ABFT and Atlantic cod in the Gulf of Maine will be 
strongly influenced by further shifts in seawater temperature and prey 
availability. Biotic MeHg concentrations in other marine regions are 
likely to be similarly affected by widespread shifts in trophic interactions 
and seawater temperature. A two-pronged regulatory effort that involves 
reductions in the emissions of both greenhouse gases and Hg is therefore 
needed to reduce MeHg concentrations in pelagic predators. Notably, 
regulations that aim to reduce air pollution caused by carbon-intensive 
fuel sources (such as coal-fired utilities) also have the co-benefit of bring- 
ing about large reductions in anthropogenic Hg releases!°. 

Atmospheric Hg concentrations in the Northern Hemisphere 
declined by approximately 30% between the mid-1990s and 2000s, as 
a result of successful reductions in emissions from coal-fired utilities, 
industry and waste incinerators, and the phasing out of Hg in many 
commercial products in the United States and Europe’*. Previous studies 
have suggested that these and other regulations have led to corre- 
sponding declines in tissue Hg concentrations in ABFT and bluefish 
(Pomatomus saltatrix) in the Atlantic Ocean!*"°. Despite these benefits,
recent regulatory proposals in the United States threaten to overturn 
rules that regulate mercury releases from coal-fired utilities and pro- 
posals to curb carbon emissions. Climate change is likely to exacer- 
bate human exposure to MeHg through marine fish, suggesting that 
stronger rather than weaker regulations are needed to protect ecosystem 
and human health. 
METHODS


Mercury concentration data in fish. Many studies report total Hg rather than 
MeHg in fish tissue. Extensive data on total Hg and MeHg concentrations in 
pelagic, demersal and benthic food webs of the Gulf of Maine were collected 
between 2000 and 2002!°. We used the measured MeHg fraction (90%) to scale 
total Hg values for ABFT. For lower trophic levels with variable MeHg concentra- 
tions we relied on direct MeHg measurements. Size-fractionated phytoplankton 
and zooplankton samples were obtained on research cruises and zooplankton spe- 
cies were identified and separated by a plankton ecologist. These data are shown 
in Extended Data Table 1. Fish and shellfish data are summarized in Extended 
Data Table 2. Trophic levels were determined from stable nitrogen isotopes (δ¹⁵N)
measured in the same samples. 
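
The two conversions described in this paragraph can be sketched as follows. The 90% MeHg fraction comes from the text, and the trophic-level expression is the one given in the footnotes to Extended Data Tables 1 and 2; the function names and the example inputs are hypothetical.

```python
MEHG_FRACTION = 0.90  # measured MeHg share of total Hg, applied to ABFT (from the text)


def mehg_from_total_hg(total_hg_ng_g, fraction=MEHG_FRACTION):
    """Scale a total Hg tissue concentration (ng/g) to an MeHg concentration."""
    return total_hg_ng_g * fraction


def trophic_level(delta15n_permil):
    """Trophic level from d15N, TL = 1 + (d15N + 0.03) / 3.4 (table footnote)."""
    return 1.0 + (delta15n_permil + 0.03) / 3.4


# Hypothetical inputs, chosen only to exercise the two functions.
print(mehg_from_total_hg(900.0))  # total Hg of 900 ng/g
print(trophic_level(8.13))        # d15N of 8.13 per mil
```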

Mercury concentrations in apex predators were compiled from several sources. 
A previous study”! reported total Hg in swordfish (Xiphias gladius) from the west- 
ern Atlantic Ocean (n = 192) with corresponding weights. Another research team 
measured total Hg in n = 1,279 ABFT harvested from the Gulf of Maine’®. Length 
(cm) and body weights (kg) were available for all tuna and used to estimate age, 
which ranged from 9 to 14 years. Data from this study!® were converted from 
dressed weight to whole weight by multiplying dressed weight by 1.25. 

Temporal data on MeHg concentrations in ABFT harvested from the Gulf 
of Maine were compiled from several sources, for fish lengths (250 ± 23 cm
(mean ± s.d.)) and ages that correspond to approximately 14-year-old fish
(Extended Data Table 3). For 1971 (n = 5)”° and 2002 (n = 3)!, 14-year old fish 
were identified based on reported length. For 1990, reported fish ages (n = 14) 
ranged between 8 and 15 years!®. For 2004-2012, MeHg concentrations in 14-year- 
old ABFT harvested from the Gulf of Maine were reported in a comprehensive 
study'®. ABFT tissue from individual fish harvested in 2017 from the Gulf of Maine 
were analysed for Hg in this study and are reported in Extended Data Table 3. 
Food web bioaccumulation model. Measured MeHg concentrations in the north- 
western Atlantic Ocean (Extended Data Fig. 1a) show characteristic increases 
across more than five trophic levels (derived from δ¹⁵N)!°. However, MeHg
concentrations in swordfish and ABFT are underpredicted by the linear relationship
between log[MeHg] and δ¹⁵N. The slope of this relationship is known as the trophic
magnification slope, and this parameter has been used to assess global patterns in 
biomagnification of MeHg in freshwater ecosystems”. However, the factors that 
govern variability in trophic magnification slopes across ecosystems are poorly 
understood, and their application to marine ecosystems is further complicated by 
potential shifts in baseline δ¹⁵N for migratory species such as ABFT and
swordfish’®. We therefore developed a new mechanistic model for biomagnification of
MeHg in marine food webs as a function of ecosystem properties®. 
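
As an illustration of the trophic magnification slope mentioned above, the sketch below fits a straight line to log10[MeHg] against trophic level for the plankton size classes, using mean values as read from Extended Data Table 1. It is not the authors' analysis: it covers plankton only, uses an ordinary least-squares fit written out by hand, and converts the slope to a per-δ¹⁵N basis with the 3.4‰ step from the trophic-level formula in the table footnote.

```python
import math

# (trophic level, mean MeHg in ng/g) for plankton size classes,
# as read from Extended Data Table 1.
plankton = [
    (1.0, 0.13), (1.0, 0.17), (2.6, 0.32), (2.8, 0.48), (2.7, 0.58),
    (3.1, 1.4), (3.4, 2.0), (3.4, 4.3), (3.4, 5.5),
]


def least_squares(xs, ys):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


levels = [tl for tl, _ in plankton]
log_mehg = [math.log10(conc) for _, conc in plankton]

slope_per_level, intercept = least_squares(levels, log_mehg)
# One trophic level corresponds to 3.4 per mil of d15N in the TL formula.
slope_per_permil = slope_per_level / 3.4

print(f"slope per trophic level: {slope_per_level:.2f}")
print(f"slope per per-mil d15N:  {slope_per_permil:.3f}")
```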

We parameterized the mechanistic model for MeHg bioaccumulation to the 
food web that was characteristic of the Gulf of Maine in the early 2000s (Extended 
Data Fig. 2), and evaluated predicted tissue MeHg concentrations against meas- 
urements compiled previously for that period’®. We then applied the evaluated 
model to simulate the effects of measured temperature anomalies and documented 
shifts in trophic structure on MeHg concentrations in predatory fish. The model 
can be run deterministically, using the central estimate of all parameter values, or 
stochastically, to capture variability in seawater MeHg, dissolved organic carbon 
(DOC), prey consumption and other parameters. 

The food web model includes three size classes for phytoplankton (picoplankton 
(0.2-2.0 µm), nanoplankton (2-20 µm) and microplankton (20-200 µm)); small
(herbivorous) and large (omnivorous) zooplankton; macroinvertebrates; and fish. 
The lower (plankton) food web model has been described in detail previously’. 
In brief, our model simulates changes in MeHg uptake by phytoplankton due to 
varying seawater MeHg concentrations, differences in the composition of phyto- 
plankton communities and varying DOC concentrations. The relative abundance 
of different size classes of phytoplankton is based on empirical relationships with 
surface concentrations of chlorophyll a°. Monthly average concentrations of chlo- 
rophyll a for the Gulf of Maine were derived from measurements collected at eight 
stations between 1997 and 2001°. 

Phytoplankton MeHg concentrations are modelled based on passive uptake of 
MeHg from seawater (driven by cell surface-to-volume ratios and DOC concen- 
trations), because experimental data show that MeHg uptake by most phytoplank- 
ton species is not sensitive to seawater temperature®. This parameterization has 
previously been used to explain phytoplankton MeHg concentrations across 
a range of ecosystems in the northwest Atlantic’. DOC concentrations
measured in the Gulf of Maine (n = 82) are log-normally distributed (81 ± 15 µM
(mean ± s.d.))®. Seawater MeHg concentrations are based on previous
measurements?’ in the upper 60 m of the water column in the Gulf of Maine. Measured
MeHg concentrations ranged between 0.015 and 0.055 pM and an average of 7% 
of the total Hg was present as MeHg. Sediment MeHg concentrations are based 
on those reported previously’? in integrated 15-20-cm grab samples of surface 
sediment (n = 95) from the Gulf of Maine that were collected between 2000 and 
2002 (0.44 ± 0.32 pmol g⁻¹ (mean ± s.d.)).
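
For the stochastic runs, a log-normally distributed input such as DOC has to be drawn from a distribution whose parameters match the reported summary statistics. The sketch below assumes that 81 and 15 µM are the arithmetic mean and standard deviation, and converts them to the parameters of the underlying normal distribution with the standard moment-matching relations; the function name, seed and sample size are hypothetical and are not taken from the model code.

```python
import math
import random


def lognormal_params(mean, sd):
    """Return (mu, sigma) of ln(X) for a log-normal X with the given mean and s.d."""
    sigma_sq = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma_sq / 2.0
    return mu, math.sqrt(sigma_sq)


# DOC in the Gulf of Maine: 81 ± 15 µM (mean ± s.d.), from the text.
mu, sigma = lognormal_params(81.0, 15.0)

rng = random.Random(0)  # fixed seed so that the draws are repeatable
draws = [rng.lognormvariate(mu, sigma) for _ in range(10_000)]
sample_mean = sum(draws) / len(draws)

print(f"mu = {mu:.3f}, sigma = {sigma:.3f}, sample mean = {sample_mean:.1f} µM")
```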
Time-dependent simulations for ABFT are based on measured MeHg concentrations
in seawater!” between 2008 and 2010, scaled by the trajectory in total Hg
concentrations in the surface ocean between 1950 and 2030. Total Hg concentra- 
tions in the North Atlantic surface ocean were modelled using historical data on 
atmospheric Hg emissions” and a global geochemical model with resolved ocean 
basins”*?°. The annual concentrations (in pM) of MeHg in seawater that were used 
to force the time-dependent bioaccumulation simulation are shown in Extended 
Data Table 4. We used records of sea surface temperature (Extended Data Table 5) 
for the Gulf of Maine from 1950 to 2015° to simulate temperature-driven changes 
in MeHg in ABFT (Extended Data Table 6). 
Evaluation and sensitivity analysis of the food web model. A comparison of mod- 
elled and observed MeHg concentrations in ABFT as a function of size revealed 
that measurements were substantially underestimated (n = 1,195 observations, 3% 
within the 67% model confidence interval) when standard bioenergetics algorithms 
for energy expenditure, prey consumption and growth were used (Extended Data 
Fig. 1b, dashed line). Most bioaccumulation models assume that fish activity levels 
are constant”®. This results in a decreasing proportion of energy that is expended 
for respiration as fish weight increases. By contrast, migratory distance and energy 
expenditures for pelagic marine fish increase as they grow and swim more
rapidly’””8. Wild activity, particularly for migratory fish, is difficult to measure and
thus rarely incorporated into estimates of consumption rates. Accurate consump- 
tion rates for fish in the wild are needed to model bioaccumulative contaminants 
such as MeHg. To account for these factors, we used swimming speed-, mass- and 
species-dependent activity multipliers (see Supplementary Information). 

Increasing the migratory energy expenditure of ABFT on the basis of established 
relationships with body size and swimming speed results in a shift in the expected 
mean of the model to match the central tendency of observations (Extended Data 
Fig. 1b, solid line). After accounting for migratory energy expenditure, the 95% 
confidence interval of probabilistically simulated MeHg concentrations in ABFT 
captures 90% of the observations. The probabilistic simulation includes distribu- 
tions for variable seawater MeHg, DOC, MeHg assimilation efficiencies and prey 
selection (Extended Data Table 7, Supplementary Information). Electronic tagging 
data show that western ABFT and swordfish spend a large fraction of their lifespan 
in shallow waters (<200-m depth) near the eastern coastline of North America”””, 
where measured MeHg concentrations!”*! range from 0.03 to 0.06 pM. The mod- 
elled upper and lower bounds for MeHg and DOC concentrations measured in the 
northwestern Atlantic Ocean capture 99% of the observed MeHg concentrations in 
ABFT. These results indicate a good model performance for ABFT when migratory 
energy expenditure is included. 
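
The evaluation metric used here, the share of observations that fall inside a model confidence interval, can be written as a short function. This is a minimal sketch with a hypothetical function name and made-up numbers; the interval bounds would in practice come from the probabilistic simulation at the size of each observed fish.

```python
def coverage(observations, lower, upper):
    """Fraction of observations that lie within their [lower, upper] bounds."""
    inside = sum(
        1 for obs, lo, hi in zip(observations, lower, upper) if lo <= obs <= hi
    )
    return inside / len(observations)


# Hypothetical tissue concentrations (ng/g) and interval bounds for five fish.
observed = [620.0, 710.0, 805.0, 990.0, 540.0]
ci_lower = [600.0, 600.0, 700.0, 700.0, 600.0]
ci_upper = [800.0, 800.0, 900.0, 900.0, 800.0]

print(f"{coverage(observed, ci_lower, ci_upper) * 100:.0f}% of observations inside the interval")
```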

Prey consumption by most species is restricted by their body size— 
specifically, by the width of their mouth gape. This constrains the predator- 
to-prey length ratio to approximately 9:1, which we use in our standard 
model*”. For swordfish, observed MeHg concentrations (n = 156)?! are 
underpredicted by both the standard bioenergetics model (Extended Data 
Fig. 1c, dashed line) and the model adjusted for increased migratory energy 
expenditure (Extended Data Fig. 1c, dotted line). Only 5% of observations fall 
within the 67% model confidence interval. 

Swordfish are known to slash and knock out prey of a larger size than that 
predicted by their mouth-gape width**. The primary prey for swordfish at 
maturity are cephalopods, which catch larger prey using their tentacles and are 
thus also less constrained by body size. Better agreement between modelled 
MeHg concentrations and observations is achieved by adjusting allowable 
predator-to-prey length ratios*»*? to account for the larger prey sizes consumed 
by swordfish and cephalopods (Extended Data Fig. 1c, solid line). Model results 
show that 29% of the observations fall within the 67% confidence interval of the 
probabilistic simulation (orange shaded region in Extended Data Fig. 1c; 57% 
within the 95% model confidence interval). Simulating the upper and lower 
envelope of predator-to-prey length ratios (ratios from 10:1 to 2:1; yellow region 
in Extended Data Fig. 1c) captures 98% of the observations. Following these 
adjustments for apex predators, our results indicate excellent performance 
(R² = 0.92) of the bioenergetics model for MeHg bioaccumulation® compared
to observations’? across five trophic levels in the Gulf of Maine food web 
(Extended Data Fig. 1d). 
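
The gape constraint described above amounts to a cap on prey length set by the predator-to-prey length ratio. The sketch below shows that cap for the standard 9:1 ratio and for the 10:1 to 2:1 envelope simulated for swordfish and cephalopods; the function name and the 250-cm predator length are illustrative assumptions.

```python
def max_prey_length(predator_length_cm, ratio=9.0):
    """Largest prey length (cm) allowed by a predator-to-prey length ratio."""
    return predator_length_cm / ratio


predator_cm = 250.0  # illustrative predator length

print(f"standard 9:1 ratio: {max_prey_length(predator_cm):.1f} cm")
print(f"10:1 bound:         {max_prey_length(predator_cm, 10.0):.1f} cm")
print(f"2:1 bound:          {max_prey_length(predator_cm, 2.0):.1f} cm")
```

Relaxing the ratio towards 2:1 admits much larger prey, which is the adjustment the text applies to improve agreement for swordfish.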

Reporting summary. Further information on research design is available in 
the Nature Research Reporting Summary linked to this paper. 
Extended Data Fig. 1 | Comparison of observed and modelled MeHg
concentrations from a marine food web in the Gulf of Maine.
a, Measured MeHg concentrations in biota and trophic positions based
on nitrogen isotopes’®. b, Measured (symbols) MeHg concentrations!*!®
in ABFT from the Gulf of Maine compared to modelled concentrations
based on standard bioenergetics algorithms (dashed line) and based on
bioenergetics algorithms adjusted for the energy consumption that is
associated with migration and rapid swimming speeds (solid line). The
blue shaded region shows the 67% confidence interval around the model
and the grey shaded region represents the upper and lower bounds of
modelled seawater MeHg and DOC concentrations. Each data point
represents an individual fish; n = 1,284. c, Measured (symbols) MeHg
concentrations”! in swordfish and modelled MeHg concentrations based
on standard bioenergetics algorithms (dashed line); algorithms adjusted
for migratory energy expenditure and swimming speed (dotted line); and
algorithms adjusted for energy expenditure and large prey consumption
(solid line). The yellow shaded region indicates the upper and lower
bounds of predator-to-prey length ratios (10:1 to 2:1), the orange shaded
region shows the 67% confidence interval around the model and the
grey shaded region represents the upper and lower bounds of modelled
seawater MeHg and DOC. Each data point represents an individual fish;
n = 203. d, Comparison of observed and modelled MeHg concentrations
for the Gulf of Maine food web across five trophic levels!’. The model
is forced by seawater MeHg concentrations’’ ranging from 0.015 to
0.055 pM. Each data point represents the mean MeHg concentration in
fish of a similar weight (n = 119); error bars represent s.d.
Extended Data Fig. 2 | Feeding relationships in the Gulf of Maine marine food web. Trophic interactions for the Gulf of Maine food web that are
included in our MeHg bioaccumulation model. 
Extended Data Table 1 | MeHg in Gulf of Maine plankton

Category       Dominant species                                    Size [mm]     Mass [g]    MeHg [ng g⁻¹]   Trophic level*
Microplankton  Pleurosigma sp., Thalassionema nitzschioides        0.025-0.063   6.5×10⁻⁸    0.13 ± 0.10     1.0
               Streptotheca sp.                                    0.063-0.12    1.1×10⁻⁶    0.17 ± 0.13     1.0
               Rhizosolenia alata, Oithona sp.                     0.12-0.25     8.2×10⁻⁶    0.32 ± 0.16     2.6
Mesoplankton   Calanus sp.                                         0.25-0.50     6.5×10⁻⁵    0.48 ± 0.32     2.8
               Calanus finmarchicus (copepodites)                  0.5-1.0       5.2×10⁻⁴    0.58 ± 0.19     2.7
Macroplankton  Calanus finmarchicus (adults)                       1.0-2.0       4.2×10⁻³    1.4 ± 0.72      3.1
               Calanus hyperboreus                                 2.0-4.0       3.4×10⁻²    2.0 ± 1.2       3.4
Nekton         Meganyctiphanes norvegica                           4.0-8.0       0.27        4.3 ± 1.3       3.4
               Meganyctiphanes norvegica, Pasiphaea multidentata   8.0-16        2.1         5.5 ± 1.9       3.4

MeHg concentrations (ng g⁻¹; mean ± s.d.) measured in Gulf of Maine plankton between 2000 and 2002¹⁹. Plankton were collected yearly with 3-5 tows and sieved for size fractions. Macroplankton, ichthyoplankton and nekton were collected yearly with 4-5 Vass-Tucker trawls¹⁹.
*Trophic level (TL) was estimated previously¹⁹ as TL = 1 + (δ¹⁵N + 0.03)/3.4.
Extended Data Table 2 | MeHg in Gulf of Maine fish and shellfish

Species                Scientific name             Mass [kg]        MeHg [ng g⁻¹]
Blue mussel            Mytilus edulis              0.005 ± 0.001    5.3 ± 1.1
Sea scallop            Placopecten magellanicus    0.04 ± 0.03      6.9 ± 1.5
American lobster       Homarus americanus          0.34 ± 0.03      28 ± 11
Winter flounder        Pseudopleuronectes          0.22             15 ± 7.8
Yellowtail flounder    Limanda ferruginea          0.55 ± 0.17      23 ± 8.7
Pollock                Pollachius virens           0.41 ± 0.48      15 ± 5.4
Haddock                Melanogrammus aeglefinus    0.46 ± 0.28      18 ± 11
Cod                    Gadus morhua                1.5 ± 1.2        27 ± 14
White hake             Urophycis tenuis            1.1 ± 0.40       24 ± 10
Cunner                 Tautogolabrus adspersus     0.12 ± 0.02      75 ± 28
Spiny dogfish          Squalus acanthias           1.4 ± 0.80       84 ± 27
Atlantic herring       Clupea harengus             0.15 ± 0.050     40 ± 25
Atlantic mackerel      Scomber scombrus            0.16 ± 0.080     17 ± 5.3
Swordfish              Xiphias gladius             100 ± 30         580 ± 240
Atlantic bluefin tuna  Thunnus thynnus             340 ± 30         710 ± 140
Thresher shark         Alopias vulpinus            560              1800

Category labels in the table: Invertebrates, Benthic, Benthopelagic, Pelagic. Trophic level* column: only the entries 3.1, 3.1, 46 and 44 survive.

MeHg concentrations (ng g⁻¹; mean ± s.d.) measured in Gulf of Maine fish and shellfish between 2000 and 2002¹⁹.
*Trophic level (TL) was estimated previously¹⁹ as TL = 1 + (δ¹⁵N + 0.03)/3.4.
Extended Data Table 3 | MeHg in Gulf of Maine ABFT

Capture year   Mean MeHg† [ng g⁻¹ wet weight]   Standard deviation   Pairwise comparison‡   Sample size (n)   Data source
1971           593                              144                  a                      5                 8
1990           992                              144                  f                      14                18
2002           637                              96.1                 ac                     3                 7
2004           822                              254                  de                     15                5
2005           652                              147                  ac                     9                 5
2006           815                              199                  de                     10                5
2007           792                              182                  de                     7                 15
2008           755                              198                  bcd                    8                 5
2009           580                              -                    a                      1                 5
2010           777                              214                  cde                    9                 15
2011           614                              195                  ab                     12                15
2012           689                              123                  ad                     13                5
2017           809                              277                  ef                     33                This study§

MeHg concentrations (ng g⁻¹; mean ± s.d.) measured in ABFT that were captured in the Gulf of Maine between 1971 and 2017. All fish are approximately 14 years old or have a curved fork length of 250 ± 23 cm¹⁴ (mean ± s.d.).

†MeHg was estimated as 90% of total Hg on the basis of previously published measurements¹⁹.

‡A one-way ANOVA was used for each ABFT cohort to investigate the statistical significance of differences in fish MeHg concentrations between years (P < 0.05). Common letters indicate groups with no significant difference in between-group comparisons (post hoc analysis using Tukey’s test). Statistical analysis was performed in R (v.3.4.3).

§Dorsal or cranial muscle tissue from ABFT that were captured in the Gulf of Maine in 2017 was freeze-dried and homogenized. Total Hg content was measured using a Nippon MA-3000 direct thermal decomposition Hg analyser at Harvard University. Average recoveries of certified reference materials were 104.0 ± 0.6% (mean ± s.d.) (TORT, n = 5) and 102.2 ± 1.7% (mean ± s.d.) (DORM-4, n = 8). The relative precision of duplicate samples (RPD) was 3.1%.
Extended Data Table 4 | Modelled changes in seawater MeHg in the Gulf of Maine

Year    MeHg (pM)   Year    MeHg (pM)   Year    MeHg (pM)   Year    MeHg (pM)
1950    0.049       1970    0.061       1990    0.052       2010*   0.044
1951    0.049       1971    0.062       1991    0.052       2011    0.044
1952    0.048       1972    0.063       1992    0.051       2012    0.044
1953    0.048       1973    0.063       1993    0.050       2013    0.044
1954    0.048       1974    0.064       1994    0.050       2014    0.045
1955    0.048       1975    0.064       1995    0.049       2015    0.045
1956    0.048       1976    0.063       1996    0.049       2016    0.045
1957    0.048       1977    0.063       1997    0.048       2017    0.045
1958    0.049       1978    0.062       1998    0.048       2018    0.046
1959    0.049       1979    0.061       1999    0.047       2019    0.046
1960    0.050       1980    0.060       2000    0.047       2020    0.046
1961    0.051       1981    0.059       2001    0.046       2021    0.046
1962    0.052       1982    0.058       2002    0.046       2022    0.047
1963    0.053       1983    0.057       2003    0.045       2023    0.047
1964    0.054       1984    0.057       2004    0.045       2024    0.047
1965    0.055       1985    0.056       2005    0.045       2025    0.047
1966    0.056       1986    0.055       2006    0.045       2026    0.047
1967    0.057       1987    0.054       2007    0.044       2027    0.048
1968    0.059       1988    0.054       2008*   0.044       2028    0.048
1969    0.060       1989    0.053       2009*   0.044       2029    0.048

Seawater MeHg concentrations that were used to force modelled changes in MeHg tissue concentrations in ABFT.
*Time series is based on average measured concentrations in the upper 60 m of the Gulf of Maine between 2008 and 2010 (grey shaded cells) and scaled by the trajectory in modelled total Hg in
seawater between 1950 and 2030.
Extended Data Table 5 | Changes in seawater temperature in the Gulf of Maine

Deviation (δT) from average seawater temperature (T, °C) in the Gulf of Maine. Grey shading indicates projected temperatures. Data are from a previous study.
Extended Data Table 6 | Food web model algorithms


Parameter 


dG; 
dt 


kp 


ky 


Unit 


ng g' 


fom 


g 


unitless 
fom 


Ld? 
unitless 


fom 


unitless 
unitless 
gat 

Cc 

mg O2L*t 


unitless 
unitless 
unitless 
unitless 


Description 


Change in concentration of MeHg in predator 
species (i) 


MeHg dietary uptake rate 
MeHg water/gill ventilation uptake rate 


MeHg elimination rate
MeHg growth dilution rate


time 
wet weight of predator fish (i) 


dietary MeHg absorption efficiency 
Rate of consumption of prey (j) 


concentration of MeHg in prey species (/) 
gill ventilation/ water filtration rate 


absorption efficiency for MeHg from seawater 


MeHg elimination rate slope


MeHg elimination rate intercept

temperature coefficient 

growth rate 

seawater temperature 

dissolved oxygen concentration as a function of 
temperature 

oxygen saturation of the water column 

constant 

constant 

octanol-water partition coefficient for CH₃HgCl


Equation or Value 


{kp + ky — (ke + kg)} °C; 


Ape 
E J Gi 

Gy 
“C ‘i ———— 
WC Mi 


ag » MY - eCeT) 


Mi 
variable 


Modeled based on bioenergetics equations 


Uniformly distributed between 0.75 and 0.95 
Species-specific model parameter 


Modeled 
(M; ‘ 1073) 965 


1400- 
Cox 


Modeled based on bioenergetics equations 
Extended Data Table 5 
(-0.24-T + 14.04) « Sox 


0.9 
1.87 
155 
1.7 


Model algorithms for MeHg accumulation in marine predators. Additional background information is provided in the Supplementary Information and in our previous study.
Extended Data Table 7 | Trophic interactions in the food web model

Feeding preferences of predator species in the Gulf of Maine. All feeding preferences are based on a synthesis of stomach-contents data by the Northeast Fisheries Science Center (NEFSC).

Predator no. 25 is ABFT and predator no. 26 is swordfish.
The decoupled nature of basal metabolic rate and
body temperature in endotherm evolution


Jorge Avaria-Llautureo, Cristian E. Hernandez, Enrique Rodriguez-Serrano & Chris Venditti


The origins of endothermy in birds and mammals are important
events in vertebrate evolution. Endotherms can maintain their
body temperature (T_b) over a wide range of ambient temperatures
primarily using the heat that is generated continuously by their
high basal metabolic rate (BMR). There is also an important
positive feedback loop as T_b influences BMR. Owing to this
interplay between BMR and T_b, many ecologists and evolutionary
physiologists posit that the evolution of BMR and T_b must have
been coupled during the radiation of endotherms, changing with
similar trends. However, colder historical environments might
have imposed strong selective pressures on BMR to compensate
for increased rates of heat loss and to keep T_b constant. Thus,
adaptation to cold ambient temperatures through increases in
BMR could have decoupled BMR from T_b and caused different
evolutionary routes to the modern diversity in these traits. Here
we show that BMR and T_b were decoupled in approximately 90% of
mammalian phylogenetic branches and 36% of avian phylogenetic
branches. Mammalian BMRs evolved with rapid bursts but
without a long-term directional trend, whereas T_b evolved mostly
at a constant rate and towards colder bodies from a warmer-bodied
common ancestor. Avian BMRs evolved predominantly at a constant
rate and without a long-term directional trend, whereas T_b evolved
with much greater rate heterogeneity and with adaptive evolution
towards colder bodies. Furthermore, rapid shifts that lead to both 
increases and decreases in BMRs were linked to abrupt changes 
towards colder ambient temperatures—although only in mammals. 
Our results suggest that natural selection effectively exploited 
the diversity in mammalian BMRs under diverse, often-adverse 
historical thermal environments. 

Phylogenetic statistical methods provide us with the opportunity
to formally test whether BMR has been linked to T_b or ambient temperature
(T_a) throughout the evolution of birds and mammals. By accommodating
and identifying heterogeneity in the rate of phenotypic
evolution, these methods can detect and reconstruct accurate historical
evolutionary processes. Evaluation of the evolutionary coupling
between BMR and T_b has direct consequences for several longstanding
ecological and evolutionary theories (including the metabolic theory
of ecology) that assume coupling between BMR and T_b.

We first quantified and compared rates of evolution for BMR and T_b
along each branch of the time-calibrated phylogenetic trees of birds
and mammals (hereafter, branch-wise rates (r); Methods). r is a rate
scalar by which the background rate of evolution (σ²_b) is multiplied to
increase or decrease the pace of evolution; it measures how fast a trait
evolved along an individual phylogenetic branch (Methods). If BMR
and T_b were coupled during the evolution of endotherms, the amount
of change along phylogenetic branches for both traits should be positively
associated: in cases in which r_BMR is high, we expect r_Tb to be
high as well (Fig. 1b). We tested this prediction against alternative
evolutionary scenarios. First, we cannot make any inferences about coupling or
decoupling in cases in which there is no rate heterogeneity for either
BMR or T_b (r = 1 for all branches in the tree for both traits) (Fig. 1a).
Second, we infer decoupled evolution if both traits show rate heterogeneity
and the magnitudes of their r values are negatively correlated
(that is, branches that evolve at a high rate for BMR but a low rate for
T_b, and vice versa) (Fig. 1c). We suggest this scenario indicates decoupled
evolution because a negative correlation most probably indicates
that one trait tends to be conserved while the other evolved rapidly.
Third, we infer decoupled evolution if only one trait shows rate heterogeneity
while the other evolved at a constant rate (Fig. 1d, e) or if both
traits show heterogeneity but the branch-wise rates are not associated
(Fig. 1f).
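
The scenarios above can be sketched per branch. The following is a minimal illustration, not the authors' code: the function names, the tolerance and the toy rate pairs are assumptions made for the example. A branch counts as evolving at the background rate when its scalar r equals 1.

```python
from collections import Counter

def classify_branch(r_bmr: float, r_tb: float, tol: float = 1e-9) -> str:
    """Label one branch by which traits depart from the background rate (r = 1)."""
    bmr_shift = abs(r_bmr - 1.0) > tol
    tb_shift = abs(r_tb - 1.0) > tol
    if not bmr_shift and not tb_shift:
        return "both constant"        # no power to infer coupling (Fig. 1a)
    if bmr_shift != tb_shift:
        return "one trait variable"   # decoupled (Fig. 1d, e)
    return "both variable"            # needs a correlation test (Fig. 1b, c, f)

def summarise(rate_pairs):
    """Percentage of branches in each class for (r_BMR, r_Tb) pairs."""
    counts = Counter(classify_branch(a, b) for a, b in rate_pairs)
    total = len(rate_pairs)
    return {label: 100.0 * n / total for label, n in counts.items()}

# Hypothetical rate pairs for four branches
example = [(1.0, 1.0), (2.0, 1.0), (1.0, 3.0), (2.0, 3.0)]
print(summarise(example))
```

Branches labelled "both variable" are only the starting point: whether they count as coupled or decoupled depends on the sign and significance of the correlation between the two sets of rates across branches.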

As BMR, body mass (M), T_b and T_a are (at least to some extent)
correlated in extant birds and mammals, and such correlations may
vary between orders, we estimated the branch-wise rates for BMR
and T_b while accounting for their covariates across extant species
using the phylogenetic variable-rate regression model (Methods).
Fig. 1 | Possible evolutionary scenarios for BMR and T_b given their
branch-wise rates in a bivariate space. a, Both traits evolve at a single
constant rate across all branches of the tree (r_BMR = 1 and r_Tb = 1); in this
case, we have no statistical power to evaluate an association between BMR
and T_b. b, A positive correlation between r_BMR and r_Tb indicates that both
traits are coupled: in cases in which BMR changed more, T_b also changed more.
c, A negative correlation between r_BMR and r_Tb implies that both traits are
decoupled because when BMR changed more, T_b changed less. d-f, The absence
of a correlation indicates that both traits are decoupled: when BMR evolved
at a single constant rate, T_b evolved at a variable rate (d) or vice versa (e);
or both traits evolved at variable rates (r_BMR ≠ 1 and r_Tb ≠ 1) but their
magnitudes were not statistically correlated (f). Grey colour represents the
constant background rate (r = 1). Red colours represent rates that are
faster than the background rate (r > 1) and blue colours represent rates
that are slower than the background rate (r < 1), which might be related to
past events of positive and stabilizing selection, respectively. Point fill
colours represent the magnitudes of r_BMR and point outline colours
represent magnitudes of r_Tb.
Fig. 2 | Branch-wise rates of BMR, T_b and T_a on the mammalian and
avian phylogeny. a–c, Branch-wise rates for mammalian BMR (a), T_b (b)
and T_a (c). d–f, Branch-wise rates for avian BMR (d), T_b (e) and T_a (f). The
r values for each phylogenetic branch are shown in colours, and indicate
whether the trait evolved at a constant background rate (r = 1, grey
branches), at rates slower than the background rate (r < 1, blue-gradient
branches) or at rates faster than the constant rate (r > 1, red-gradient
branches). All silhouettes were obtained from http://phylopic.org/.
Mammalian silhouettes were created by the following individuals
(from top to bottom): Monotremata, S. Werning; Marsupialia,
M. Callaghan; Hyracoidea, S. Traver; Tubulidentata, P. Scott;
Macroscelidea (uncredited); Pilosa, FunkMonk; Eulipotyphla, B. Barnes;


This approach enables the simultaneous estimation of both an overall
relationship between, for instance, BMR as a function of M
and T_b across extant species, and any shifts in branch-wise rates
that apply to the phylogenetically structured residual variance in
the relationship. In both birds and mammals, the phylogenetic
variable-rate regression model fits the data significantly better than the
constant-rate regression models, which assume a single constant rate
(r = 1) across all branches (Methods and Supplementary Tables 1-8).
The best-fitting phylogenetic variable-rate regression model for
mammalian BMR includes both M and T_b with a single slope for each
trait that is estimated across all orders (Supplementary Tables 1, 2).
For mammalian T_b, the best-fitting model includes M and BMR as
covariates, also with a single slope across all orders (Supplementary
Tables 3, 7). In birds, the best model for BMR includes only M, with
a single slope for all orders (Supplementary Table 4). Finally, the
best-fitting model for avian T_b includes M only in Columbiformes
(Supplementary Table 6).
Artiodactyla, nicubunu; Pholidota, S. Traver; Carnivora (uncredited);
Chiroptera, Y. Wong; Scandentia, T. M. Keesey; Primates,
T. M. Keesey; Lagomorpha, A. Caravaggi; and Rodentia (uncredited). Avian
silhouettes were created by the following individuals from top to bottom:
Anseriformes, M. Martyniuk; Galliformes (uncredited); Columbiformes,
F. Sayol; Podicipediformes, D. Backlund; Procellariiformes, M. Hannaford;
Suliformes, F. Sayol; Pelecaniformes, S. Traver; Cuculiformes, F. Sayol;
Gruiformes, F. Sayol; Caprimulgiformes, F. Sayol; Apodiformes, F. Sayol;
Charadriiformes (uncredited); Accipitriformes, S. Traver; Bucerotiformes,
S. Traver; Coraciiformes, F. Sayol; Piciformes, S. Traver; Strigiformes,
F. Sayol; Coliiformes, E. J. Wetsy; Falconiformes, R. Groom; Psittaciformes,
F. Sayol; and Passeriformes, P. Pattawaro.


The branch-wise rates estimated for the best-fitting models show
that mammalian BMR evolved at a constant rate (r = 1) in only 11.2%
of branches and at faster rates (r > 1) in 88.8% of branches (Fig. 2a).
Mammalian T_b evolved at a constant rate in 70.3% of branches and
faster rates in 29.7% of branches (Fig. 2b). In birds, BMR evolved
at a constant rate in 90.5% of branches and at faster rates in 9.5%
of branches (Fig. 2d). Avian T_b evolved at a constant rate in 69% of
branches and at faster rates in 31% (Fig. 2e). When the branch-wise
rates for BMR and T_b were compared, we found that in mammals both
traits evolved at a constant rate in 10.6% of branches (Fig. 3a, consistent
with Fig. 1a). In 60.2% of branches, only one trait evolved at faster rates
while the other trait diverged at a constant rate. This indicates that BMR
and T_b evolved in a decoupled manner along these branches (Fig. 3a,
consistent with Fig. 1d, e). We found that 29.2% of branches had an
increased rate for both BMR and T_b. However, the magnitudes of the
branch-wise rates were not significantly correlated (the percentage of
the posterior distribution crossing zero as assessed by Bayesian Markov
Fig. 3 | Branch-wise rates of BMR, T_b and T_a in bivariate space for
mammals and birds. a, b, Bivariate space of mammals for r_BMR and r_Tb (a)
or r_Ta (b). c, d, Bivariate space of birds for r_BMR and r_Tb (c) or r_Ta (d). a, In
mammals, r_BMR was decoupled from r_Tb in 89.4% of branches because
either only one trait showed rate heterogeneity while the other evolved at a
single constant rate (in 60.2% of branches; grey filled and red outlined
dots, and grey outlined and red filled dots, consistent with Fig. 1d, e), or
because both traits evolved at fast rates but the magnitudes of r_BMR and r_Tb
were not correlated (in 29.2% of branches; red filled and outlined dots,
consistent with Fig. 1f). In the remainder of the branches, 10.6% (grey
middle dot, consistent with Fig. 1a), there was no variation in either r_BMR or
r_Tb. b, Bayesian generalized least squares analyses indicate that fast r_BMR
and slow to fast r_Ta (red filled and blue and red outlined dots) were
statistically correlated in 74.9% of mammalian branches (P_MCMC = 0;
n = 602 branches; black line). In 18.2% of branches, r_BMR was
decoupled from r_Ta because only one trait shows rate heterogeneity (grey
filled and red outlined dots and grey outlined and red filled dots). In the
remainder of the branches, 6.9% (grey middle dots), there was no
variation in either r_BMR or r_Ta. c, In birds, r_BMR was decoupled from r_Tb in
36.2% of branches because either only one trait showed rate heterogeneity (in
32% of branches) or because the magnitudes of fast rates in both traits were
not correlated (in 4.2% of branches). There was no rate variation for either
trait for the remaining 63.8% of branches. d, Avian r_BMR was
decoupled from r_Ta in 77.9% of branches, because either only one trait
showed rate heterogeneity (in 68.4% of branches) or because the
magnitudes of fast rates in both traits were not correlated (in 9.5% of
branches). There was no variation in either trait for the remaining 22.1% of
branches.



chain Monte Carlo (MCMC), P_MCMC = 9%) (Fig. 3a, consistent with
Fig. 1f; Supplementary Table 9). This also suggests that evolution was
decoupled in those branches, probably because of distinct selection
pressures that acted separately on BMR and T_b. On the other hand,
both traits evolved at a constant rate in 63.8% of branches for birds
(Fig. 3c, consistent with Fig. 1a). In 32% of branches, only one trait
evolved at fast rates while the other trait diverged at a constant rate
(Fig. 3c, consistent with Fig. 1d, e). In the remaining 4.2% of branches,
both traits evolved at faster rates, but the magnitudes of r were not
statistically correlated (P_MCMC = 16.9%) (Fig. 3c, consistent with Fig. 1f;
Supplementary Table 10).

As rapid bursts in the evolution of BMR were not coupled with the
evolutionary changes in T_b, we evaluated the alternative hypothesis that
postulates that BMR evolved in response to T_a. This hypothesis suggests
that colder environments increase the rate of heat loss from organisms
and that this loss is subsequently compensated for by increases
in BMR. These increases in BMR could have occurred over long
periods of time because of global cooling, generating a long-term
directional trend in BMR during the radiation of mammals and birds.
This expectation is consistent with the plesiomorphic-apomorphic
endothermy model. By assuming that BMR and T_b are coupled in
endotherms and that they can both be used as a proxy for the degree
of endothermy, the plesiomorphic-apomorphic endothermy model
predicts a general tendency towards higher endothermic levels over
time (from basoendothermic ancestors; Methods) associated with the
global cooling during the Cenozoic era. However, global cooling is not
the only source of variation in T_a. Long-term directional increases in
BMR may have also been driven by historical dispersals of endotherms
towards higher latitudes. In either case, if a long-term decrease in T_a
drove adaptation through increases in BMR, and T_b followed the same
trajectory (as assumed by the plesiomorphic-apomorphic endothermy
model), we expect to find a positive correlation between the branch-wise
rates of BMR and the branch-wise rates of T_a. With this in mind,
we also expect a positive trend towards higher values of BMR and T_b
for basoendothermic ancestors and a negative trend towards lower T_a
for warmer ancestral environments. We used the phylogenetic
variable-rate regression model to estimate the branch-wise rates for T_a while
accounting for latitude as, generally, T_a decreases from the equator to
the poles (Methods and Supplementary Table 11).

The phylogenetic variable-rate regression model significantly
improved the fit to the T_a data over the constant-rate regression model
in both mammals and birds (Supplementary Table 11). T_a evolved at
a constant rate in 21.2% of mammalian branches, and with rate heterogeneity
in the remaining 78.8%, including 72.2% of branches with
faster rates and 6.6% with slower rates (r < 1) (Fig. 2c). This indicates
that most ancestral mammalian lineages (72.2%) faced abrupt historical
changes in their T_a environment, while far fewer lineages (6.6%,
most of which were bats) survived and continued to exist in similar
thermal environments. In birds, 77.6% of branches show faster rates of
change in T_a, 22.1% show changes at a constant rate and in only a single
branch did the T_a change at a slower rate (Fig. 2f).

When branch-wise rates of mammalian BMR and T_a evolution were
compared, we found that they were coupled in 74.9% of branches
(P_MCMC = 0%) (Fig. 3b, consistent with Fig. 1b; Supplementary
Table 12). To evaluate further whether decreases in T_a were linked to
increases in BMR in the 74.9% of mammalian branches for which both traits were
coupled (that is, to ascertain the direction of change), we evaluated the
expected positive trend in BMR as a response to the long-term decrease
in T_a. We conducted Bayesian phylogenetic regressions between extant
values of these two variables (in turn) and the path-wise rates (sum
of rate-scaled branches along the path from the root of the tree to each
terminal species; Methods). We found a negative effect of path-wise
rates on T_a across all mammals (Fig. 4b and Supplementary Table 14),
which supports a long-term directional trend towards habitats with
lower T_a over time. However, we did not find evidence for any trend in
mammalian BMR evolution: increases and decreases in BMR showed
equal probabilities in our sample (Supplementary Table 14). Our results
suggest that in colder environments, in which resources were available
to fuel metabolic elevation, selection favoured higher mammalian
BMR. Another possibility might be that the increase in BMR was a
correlated response to direct selection on other physiological traits,
such as the maximum metabolic capacities for thermogenesis, for
which the benefits outweigh the energetic cost of BMR elevation.
Otherwise, selection may have always favoured decreases in BMR in
an ever-colder environment.
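
The path-wise rate used in these regressions (the sum of rate-scaled branch lengths from the root to a tip) can be illustrated with a short sketch. The tree encoding below (dictionaries keyed by node name) and the toy values are assumptions for illustration only and do not reflect the format used by the authors' software.

```python
def path_wise_rate(tip, parent, length, r):
    """Sum r * branch length over every branch between the root and `tip`.

    parent: maps each non-root node to its parent node
    length: maps each non-root node to the length of the branch above it
    r:      maps each non-root node to the rate scalar of that branch
    """
    total = 0.0
    node = tip
    while node in parent:            # the root has no parent entry
        total += length[node] * r[node]
        node = parent[node]
    return total

# Hypothetical three-branch tree: root -> A -> {t1, t2}
parent = {"A": "root", "t1": "A", "t2": "A"}
length = {"A": 2.0, "t1": 3.0, "t2": 3.0}
r = {"A": 1.0, "t1": 2.0, "t2": 1.0}

print(path_wise_rate("t1", parent, length, r))  # 2*1 + 3*2 = 8.0
print(path_wise_rate("t2", parent, length, r))  # 2*1 + 3*1 = 5.0
```

A tip whose lineage passed through faster branches ends up with a larger path-wise rate even though both tips are the same age, which is what lets a regression of extant trait values on path-wise rates reveal a directional trend.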

In contrast to mammals, most avian branches that experienced
rapid shifts in T_a did not show evidence for coupled changes in BMR:
68.4% of branches had fast rates of T_a evolution but a constant rate
of BMR evolution (Fig. 3d, consistent with Fig. 1d, e). Moreover, the
small fraction of branches for which BMR evolved at fast rates (9.5%)
were not linked to rapid shifts in T_a (Fig. 3d, consistent with Fig. 1f;
Supplementary Table 13). Avian BMR did not show a positive evolutionary
trend despite the fact that birds also experienced colder environments
over time (Fig. 4d and Supplementary Table 15). Birds might
not have responded to colder temperatures by changes in their BMR
their evolutionary history. a-d, Path-wise rates had a significant negative
effect on mammalian T_b (a; P_MCMC = 4%; n = 502 species) and avian T_b
(c; P_MCMC = 3%; n = 367 species) and on mammalian T_a (b; P_MCMC = 0;
n = 2,922) and avian T_a (d; P_MCMC = 0; n = 6,142 species), supporting a
negative macroevolutionary trend for both T_b and T_a in mammals and
birds. Lighter blue and dark blue lines indicate the posterior distribution
of slopes and the mean slope, respectively, estimated from the Bayesian
phylogenetic generalized least squares (Methods).


because their lower thermal conductance may have helped them to
retain internal heat. Alternatively, other physiological strategies, such
as torpor, may have been selected for in colder environments.

Finally, we found a negative effect of path-wise rates on T_b in both
mammals (Fig. 4a and Supplementary Table 14) and birds (Fig. 4c and
Supplementary Table 15). This suggests that, on average, endotherms
evolved towards colder bodies from warmer-bodied ancestors. These
directional models predict a mean T_b of 35.3 °C and 40.4 °C for the most
recent common ancestor (MRCA) of mammals and birds, respectively
(Fig. 4a, c), suggesting that early birds and mammals were mesoendotherms
rather than basoendotherms (Methods). This result does not
support the idea that ancestral mammals could not attain T_b > 30 °C
owing to the increased metabolic rates that would be necessary to compensate
for heat loss in cold environments. However, if the T_b − T_a
differential (ΔT) determines how hot early mammals were, we expect
that a mammalian MRCA with a T_b of 35.3 °C could survive in an
environment that was warm enough to have a low ΔT. Our model that
describes the negative trend in T_a predicts that the MRCA of mammals
lived in an environment that was 23 °C on average (Fig. 4b), resulting
in a ΔT of 15.3 °C. This ancestral ΔT is very conservative compared
with the ΔT values that have been observed in extant mammals. For
example, there are small mammals that achieve a T_b higher than 39 °C
(such as Microdipodops pallidus) and that can survive in environments
of 11 °C (ΔT = 28 °C). Furthermore, some larger mammals have a
stable T_b even in extreme environmental conditions: the Arctic hare
(Lepus arcticus) can maintain its T_b of 38 °C in temperatures as low
as −12 °C (ΔT = 50 °C).
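
As a small worked check of the differential for the two extant examples, the sketch below computes ΔT as body temperature minus ambient temperature. The function name is illustrative and not taken from the paper.

```python
def delta_t(tb_celsius: float, ta_celsius: float) -> float:
    """Body-to-ambient temperature differential, T_b - T_a, in degrees Celsius."""
    return tb_celsius - ta_celsius

# Microdipodops pallidus: T_b of 39 °C in an 11 °C environment
print(delta_t(39.0, 11.0))   # 28.0
# Arctic hare (Lepus arcticus): T_b of 38 °C at -12 °C
print(delta_t(38.0, -12.0))  # 50.0
```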

Taken together, our results show that BMR was not coupled to T_b
across the evolution of endothermic species. As environments became
colder, mammals survived by changing their BMR, while birds probably
survived owing to their high thermal insulation. Evaluating the isolated
and/or combined effects of environmental variables on physiological
attributes has implications for evidence-based projections for
the future. In this sense, the previously unappreciated complexity,
interplay and decoupled nature of the evolutionary history of BMR,
T_b and T_a may point to the undetected resilience of endotherms in the
face of modern global challenges.
METHODS

Data. We used a time-calibrated phylogenetic tree of extant mammals
(n = 3,321), and data for M, BMR and T_b were obtained from a previously
published study (n = 632). After identifying species in the tree that have trait
information, we obtained a final mammalian dataset of 502 species, which
includes representatives from 15 orders (Supplementary Information).

For birds, we used the consensus time-calibrated tree from a previous study.
This tree was inferred from the samples of trees that have previously been
published. Data for BMR, T_b and M were obtained from a previously published
study. After matching this database with the phylogenetic tree, we obtained
a final sample of 164 species, which includes representatives from 21 orders
(Supplementary Information). The dataset used to evaluate evolutionary trends
in T_b (see below) has previously been published, and contains 367 species with
phylogenetic information.

Data for T_a and latitude for extant mammals and birds were extracted from
a previous publication. These datasets include 2,922 species of mammals and
6,142 species of birds, which have phylogenetic information. The T_a for extant
endothermic species is the temperature of the environments that birds and mammals
inhabit today, measured as the mean ambient temperature for the mid-point
latitude of each species distribution. The T_a at which a species exists today may
not be a heritable trait per se. However, the evolution of T_a can still be inferred
using phylogenetic methods as habitat selection reflects adaptations of the species
(traits) to some characteristics of the environment. This interrelationship
should leave a phylogenetic signal in the T_a at which endothermic species live.
Accordingly, we found a significant phylogenetic signal in the T_a of both mammals
(posterior mean λ = 0.77; Bayes factor = 665) and birds (posterior mean λ = 0.8; Bayes
factor = 1,404). Furthermore, the phylogenetic signal for T_a is very high (λ = 1) in
birds and mammals when estimated using the median-r scaled tree.

Finally, to evaluate the endothermic levels for the MRCA of mammals and
birds that have previously been proposed, we followed this categorization of
endothermic species: basoendotherms (T_b birds < 40.4 °C; T_b mammals < 35.0 °C),
mesoendotherms (40.4 °C ≤ T_b birds ≤ 42.5 °C; 35.0 °C ≤ T_b mammals ≤ 37.9 °C) and
supraendotherms (T_b birds > 42.5 °C; T_b mammals > 37.9 °C).
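
The categorization can be written as a small lookup. This is an illustrative sketch only: the names are invented here, and it assumes that the boundary values themselves (for example, 40.4 °C in birds) fall in the mesoendotherm class.

```python
# (lower, upper) bounds of the mesoendotherm class, in degrees Celsius
MESO_BOUNDS = {"birds": (40.4, 42.5), "mammals": (35.0, 37.9)}

def endothermy_level(tb_celsius: float, clade: str) -> str:
    """Classify a body temperature for 'birds' or 'mammals'."""
    low, high = MESO_BOUNDS[clade]
    if tb_celsius < low:
        return "basoendotherm"
    if tb_celsius <= high:          # assumption: both boundaries are inclusive
        return "mesoendotherm"
    return "supraendotherm"

# Predicted T_b of the most recent common ancestors reported in the text
print(endothermy_level(35.3, "mammals"))  # mesoendotherm
print(endothermy_level(40.4, "birds"))    # mesoendotherm
```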
Inferring the branch-wise rates of evolution. We identified heterogeneity in the
rate of evolution along phylogenetic branches (branch-wise rates) by dividing the
rate into two parameters: a background rate parameter (σ²_b), which assumes that
changes in the trait of interest (for example, BMR) are drawn from an underlying
Brownian process, and a second parameter, r, which identifies a branch-specific
rate shift. A full set of branch-wise rates is estimated by adjusting the lengths
of each branch in a time-calibrated tree (stretching or compressing a branch is
equivalent to increasing or decreasing the phenotypic rate of change relative to
the underlying Brownian rate of evolution). Branch-wise rates are defined by a
set of branch-specific scalars r (0 < r < ∞) that scale each branch to optimize the
phenotypic rate of change to a Brownian process (σ²_b × r). If phenotypic change
occurred at accelerated (faster) rates along a specific branch of the tree, then r > 1
and the branch is stretched. Decelerated (slower) rates of evolution are detected
by r < 1 and the branch is compressed. If the trait evolves at a constant rate along
a branch, then the branch will not be modified (that is, r = 1).
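
The effect of the scalars can be sketched as follows. This is a minimal illustration under a plain Brownian-motion assumption, with invented names and toy numbers; it shows only how given r values rescale branches, not how the scalars are estimated.

```python
def scale_branches(lengths, r):
    """Stretch (r > 1) or compress (r < 1) each branch by its scalar."""
    return [length * scalar for length, scalar in zip(lengths, r)]

def branch_variance(sigma2_b: float, r: float, t: float) -> float:
    """Expected variance of trait change along a branch of duration t
    when the background Brownian rate sigma2_b is multiplied by r."""
    return sigma2_b * r * t

def rate_category(r: float, tol: float = 1e-9) -> str:
    if r > 1.0 + tol:
        return "faster (stretched)"
    if r < 1.0 - tol:
        return "slower (compressed)"
    return "background (unchanged)"

# Hypothetical branch lengths (in time units) and rate scalars
lengths = [2.0, 1.0, 4.0]
scalars = [1.0, 3.0, 0.5]
print(scale_branches(lengths, scalars))          # [2.0, 3.0, 2.0]
print([rate_category(s) for s in scalars])
print(branch_variance(0.5, 3.0, 1.0))            # 1.5
```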

We estimated the r values of evolution for BMR, T_b and T_a using the
phylogenetic variable-rate regression model in a Bayesian framework. This model is
designed to automatically detect shifts in the rate of trait evolution across
phylogenetic branches while accounting for a relationship with another trait or traits
across values for extant species. This approach enables the simultaneous estimation
of both an overall relationship between, for instance, BMR as a function
of M and T_b across extant species, and any shifts in the rate r that apply to the
phylogenetically structured residual variance in the relationship. As residual
variance is explained by shifts in rate across phylogenetic branches (r) we can, for
example, determine how much BMR has changed in the past after accounting for
its covariation with M and T_b in the present (the relationship between the values
across extant species). Thus, if the amounts of change in BMR along individual
phylogenetic branches were coupled with the amounts of change of T_b, then we
should find the r_BMR values to be positively associated with the r_Tb values. The
branch-wise rates for T_a evolution can be estimated while accounting for its
covariation with other traits or factors across extant species. Previous studies on the
association between BMR and T_b that only used values for extant species have not
evaluated the association in evolutionary terms, even when they use phylogenetic
methods.

We evaluated 24 phylogenetic variable-rate regression models and 24 phylogenetic
constant-rate regression models (Supplementary Tables 1-8). The regression model
was selected using Bayes factors (B) computed from marginal likelihoods estimated
by stepping-stone sampling. B is calculated as twice the difference between the log
marginal likelihood of the complex model and that of the simple model. By convention,
B > 2 indicates positive evidence for the complex model, B = 5-10 indicates strong
support and B > 10 is considered very strong support. We inferred the r_BMR and r_Tb
values with the phylogenetic variable-rate regression models that best fit the data
for our samples of mammals and birds (Supplementary Tables 7, 8). We also estimated
the r_Ta values after accounting for the effect of the latitude of the distribution of
species (Supplementary Table 11) and, consequently, we accounted for the geographical
variation of T_a across the distribution of extant species. We used BayesTraits v.3.0
to detect the magnitude and location of r in a Bayesian reversible-jump MCMC framework,
which generates a posterior distribution of trees with branch lengths scaled according
to the rate of evolution. There is no limit or prior expectation on the number of r
branch scalars; their number varies from zero (no branch is scaled) to n, in which n is
the number of branches in the phylogenetic tree. Regarding the values of each r
parameter, we used a gamma prior, with α = 1.1 and a β parameter that is rescaled
such that the median of the distribution is equal to 1. With this setting, the numbers
of the rate increases and decreases that are proposed are balanced. We ran
50,000,000 iterations sampling every 25,000 to ensure chain convergence and independence
in model parameters in BMR and T_b analyses. We discarded the first
25,000 iterations as burn-in. For the T_a analysis in mammals, we ran 200,000,000
iterations sampling every 100,000, and we discarded the first 100,000 iterations as
burn-in. For the T_a analysis in birds, we ran 400,000,000 iterations discarding the first
100,000,000 as burn-in, and we sampled every 200,000. Regression coefficients
were judged to be significant according to a P_MCMC value calculated for each posterior
of regression coefficients: in cases in which <5% of samples in the posterior
distribution crossed zero, the coefficient was considered significantly different
from zero.
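The two decision rules described above (the Bayes factor B and the posterior check against zero) can be sketched in a few lines of Python. This is an illustrative sketch only, not the BayesTraits implementation: the function names are hypothetical, the label for B ≤ 2 is an assumption because the text gives no wording for that range, and the posterior check takes the smaller share of samples on either side of zero as the fraction that "crossed zero", which is one common reading of the description.

```python
from typing import Sequence


def bayes_factor(log_ml_complex: float, log_ml_simple: float) -> float:
    """B = 2 * (log marginal likelihood of complex model - that of simple model)."""
    return 2.0 * (log_ml_complex - log_ml_simple)


def interpret_bayes_factor(b: float) -> str:
    """Map B onto the verbal scale quoted in the text (B > 2, 5-10, > 10)."""
    if b > 10:
        return "very strong"
    if b >= 5:
        return "strong"
    if b > 2:
        return "positive"
    return "no positive evidence"  # assumption: the text does not label this range


def fraction_crossing_zero(posterior: Sequence[float]) -> float:
    """Share of posterior samples on the minority side of zero (assumed definition)."""
    if not posterior:
        raise ValueError("posterior sample is empty")
    n_positive = sum(1 for value in posterior if value > 0)
    n_negative = sum(1 for value in posterior if value < 0)
    return min(n_positive, n_negative) / len(posterior)


def is_significant(posterior: Sequence[float], threshold: float = 0.05) -> bool:
    """True when fewer than `threshold` of the samples cross zero."""
    return fraction_crossing_zero(posterior) < threshold
```

For example, log marginal likelihoods of -100 and -106 for the complex and simple models give B = 12, which falls in the "very strong" range.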

Testing the relationship between the branch-wise rates of evolution. We first
estimated the consensus branch-scaled tree for BMR and T_b from the posterior
sample of branch-scaled trees obtained with the phylogenetic variable-rate regression
model. The consensus branch-scaled tree was generated by using the median
r from the posterior distribution. We evaluated the correlation between the r_BMR
and r_Tb values using a Bayesian generalized least squares regression in BayesTraits
v.3.0. The same analyses were conducted to evaluate the correlation between r_BMR
and r_Ta. We used a uniform prior for β (the slope coefficient), which ranged from
−100 to 100. We ran 50,000,000 iterations sampling every 25,000 to ensure chain
convergence and independence in model parameters. We discarded the first 25,000
iterations as burn-in. Significance of regression coefficients was determined as
above.
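As a rough illustration of how a consensus branch-scaled tree can be built from the median r, the sketch below takes one dictionary of branch scalars per posterior sample and returns the median scalar and the rescaled length for every branch. The data layout and the function name are hypothetical, and two points are assumptions rather than statements from the text: a branch that is not scaled in a given sample is treated as r = 1, and the scaled length is taken to be the original length multiplied by r.

```python
from statistics import median
from typing import Dict, List, Tuple


def consensus_scaled_lengths(
    branch_lengths: Dict[str, float],
    posterior_scalars: List[Dict[str, float]],
) -> Dict[str, Tuple[float, float]]:
    """Return {branch: (median r, original length * median r)}.

    branch_lengths    -- original branch length for each branch identifier
    posterior_scalars -- one dict per posterior sample, holding r only for
                         the branches scaled in that sample
    """
    consensus = {}
    for branch, length in branch_lengths.items():
        # Assumption: an unscaled branch keeps its original length (r = 1).
        r_values = [sample.get(branch, 1.0) for sample in posterior_scalars]
        r_median = median(r_values)
        consensus[branch] = (r_median, length * r_median)
    return consensus
```

With three posterior samples in which branch A has r = 3, 1 and 2, the median r is 2, so a branch of length 2.0 becomes 4.0 in the consensus tree.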

Detecting trends. We evaluated the direction of change in BMR, T_b and T_a across
all mammals and birds using the path-wise rates of these variables (Supplementary
Tables 15, 16). The path-wise rate is the sum of all of the rate-scaled branches along
the path of a species, which leads from the root (the MRCA) to the tip of the tree,
and it accounts for the total amount of change that the species has experienced
during its evolution. If high path-wise rates have disproportionately been associated
with trait increases or decreases, we expect to find that species with greater
path-wise rates will have high or low trait values in the present. For instance, if
ancestral mammals experienced progressively colder environmental temperatures
owing to climate change or colonization of colder habitats as they were evolving
from their MRCA, we expect a negative correlation between the path-wise rate
of T_a and the T_a of extant species. We performed six Bayesian PGLS regressions
in BayesTraits v.3.0 to evaluate the relationship between BMR, T_b, T_a and their
path-wise rates (Supplementary Tables 15, 16). We used a uniform prior for
β (slope coefficients) that ranged from −100 to 100 to allow all possible values to
have an equal probability. Finally, we ran 50,000,000 iterations sampling every
25,000 to ensure chain convergence and independence in model parameters. We
discarded the first 25,000 iterations as burn-in. Significance of regression slopes
was determined as above.
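The path-wise rate defined above is a root-to-tip sum, which the following minimal sketch makes concrete. The tree encoding (a child-to-parent dictionary) and the function name are hypothetical choices made for illustration; the rate-scaled branch lengths are assumed to be already available, for instance from a consensus branch-scaled tree.

```python
from typing import Dict, Optional


def path_wise_rates(
    parent: Dict[str, Optional[str]],
    scaled_length: Dict[str, float],
) -> Dict[str, float]:
    """Sum the rate-scaled branch lengths from the root to each tip.

    parent        -- maps every node to its parent; the root maps to None
    scaled_length -- rate-scaled length of the branch leading to each non-root node
    """
    internal = {p for p in parent.values() if p is not None}
    tips = [node for node in parent if node not in internal]
    rates = {}
    for tip in tips:
        total = 0.0
        node = tip
        # Walk up towards the root, adding each branch on the way.
        while parent[node] is not None:
            total += scaled_length[node]
            node = parent[node]
        rates[tip] = total
    return rates
```

In a toy tree where tip A sits on a branch of scaled length 2.0 below an internal branch of scaled length 1.5, the path-wise rate of A is 3.5.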

Reporting summary. Further information on research design is available in 
the Nature Research Reporting Summary linked to this paper. 


Data availability 
No new data were generated for this study. The data used for this paper are available 
from the original sources cited in the Methods and Supplementary Information. 

Wnt and TGFβ coordinate growth and patterning to
regulate size-dependent behaviour


Christopher P. Arnold, Blair W. Benham-Pyle, Jeffrey J. Lange, Christopher J. Wood & Alejandro Sánchez Alvarado


Differential coordination of growth and patterning across 
metazoans gives rise to a diversity of sizes and shapes at tissue, organ 
and organismal levels. Although tissue size and tissue function 
can be interdependent¹⁻⁵, mechanisms that coordinate size and
function remain poorly understood. Planarians are regenerative
flatworms that bidirectionally scale their adult body size⁶,⁷ and
reproduce asexually, via transverse fission, in a size-dependent
manner⁸⁻¹⁰. This model offers a robust context to address the gap in
knowledge that underlies the link between size and function. Here, 
by generating an optimized planarian fission protocol in Schmidtea 
mediterranea, we show that progeny number and the frequency of 
fission initiation are correlated with parent size. Fission progeny size 
is fixed by previously unidentified mechanically vulnerable planes 
spaced at an absolute distance along the anterior-posterior axis. An 
RNA interference screen of genes for anterior-posterior patterning 
uncovered components of the TGFβ and Wnt signalling pathways
as regulators of the frequency of fission initiation rather than the
position of fission planes. Finally, inhibition of Wnt and TGFβ
signalling during growth altered the patterning of mechanosensory
neurons, a neural subpopulation that is distributed in accordance
with worm size and modulates fission behaviour. Our study
identifies a role for TGFβ and Wnt in regulating size-dependent
behaviour, and uncovers an interdependence between patterning, 
growth and neurological function. 

The infrequency of planarian fission behaviour has largely pre- 
cluded its mechanistic dissection. However, recently optimized worm 
husbandry techniques augmented fission activity¹¹,¹² and enabled us
to study the integration of worm size with fission behaviour. Large 
planaria (Schmidtea mediterranea) from recirculation culture systems 
exhibited robust and reproducible increases in fission activity 
when transitioned to static culture systems and starved (Fig. 1a, 
Supplementary Video 1). Live imaging provided detailed character- 
ization of the fission process. Planarians first elongate and adhere 
their posterior tissue to a substrate. Next, periodic body contractions 
concentrate body mass towards the head region while thinning out 
tissues immediately anterior to the adherent tail. After 20-40 minutes, 
progressive stretching ruptures connecting tissue with rapid recoil, 
which separates the anterior parent from the posterior fission progeny 
(Extended Data Fig. 1a, Supplementary Video 1). 

Observation of fission behaviour in worms of increasing size showed 
that the length of first posterior fission fragments did not correlate with 
parent length (Fig. 1b, d). Instead, larger worms produced additional 
progeny, each approximately 1 mm in length, such that the number of 
progeny after 2 weeks linearly correlated with parent size (Fig. 1c, e,
Extended Data Fig. 1b-d). Thus, the size of fission fragments is fixed
independently of anterior–posterior position or parent length. The
frequency of the production of fission fragments (that is, the fission
rate) did correlate with worm length (Extended Data Fig. 1e, f), and
both the time to the first fission event and the time between sequential
fission events were inversely related to parent size (Extended Data
Fig. 1g—-1). Automated webcam imaging of individual worms allowed 


us to generate timelines chronicling successful (upward displacement) 
and unsuccessful (downward displacement) fission attempts (Fig. 1f, 
Supplementary Video 2). Fission attempts occurred only in worms 
above 4-5 mm in length, which indicates a minimal size required 
for fission (Fig. 1g, h, Extended Data Fig. 2a, b). Furthermore, larger 
worms produced fission progeny more frequently owing to more fission 
attempts (Fig. 1h, Extended Data Fig. 2c, d), rather than higher rates of 
success (Fig. 1i). Together, these results confirm that planarian fission
is a size-dependent behaviour, with both progeny number and fission 
rate coupled to parent size. 

We tested the hypothesis that patterning cues are required to coordinate
worm size and planarian fission. Genes from the Wnt¹³⁻¹⁶,
TGFβ¹⁷⁻¹⁹ and Hh²⁰ signalling pathways that regulate anterior–posterior
identity were screened using RNA-dependent genetic interference
(RNAi) techniques²¹ (Fig. 2a, b, Extended Data Fig. 3a, b).
Rescreening confirmed six presumptive activators of fission (actR-1,
smad2/3, β-catenin, dsh-B, tsh and wnt11-6) and a presumptive inhibitor
(apc) (Fig. 2c). The morphology of parent worms was observed at
days 0 and 14 of the fission assay and in regenerating tissue fragments. 
RNAi knockdown reproduced published anterior—posterior patterning 
defects in regenerating tissue fragments (Extended Data Fig. 4a), but 
few morphological defects were observed in parent worms (Fig. 2d). On 
day 0, β-catenin RNAi worms exhibited morphological abnormalities,
whereas other RNAi conditions were indistinguishable from controls. 
By day 14, RNAi of actR-1 and smad2/3 elicited motility defects, but 
RNAi of dsh-B, wnt11-6, tsh and apc significantly altered fission rates 
without changes in morphology. In situ staining of the central nervous 
system (CNS), intestine and muscle confirmed published anterior- 
posterior polarity regeneration phenotypes, but no gross morphological 
defects in parent RNAi worms (Extended Data Fig. 4b-d). Therefore, 
we conclude that Wnt and TGFβ signalling components modulate fission
behaviour independently of overt body plan repolarization.

Serendipitously, we discovered that compression of planaria reveals 
cryptic mechanically vulnerable planes that divide the worm at reg- 
ularly spaced intervals along the anterior—posterior axis (Fig. 3a, b, 
Supplementary Video 3). The number of these ‘compression planes’ 
scaled with worm size (Fig. 3b, c) and their position along the anterior- 
posterior axis overlapped with the position of fission planes (Fig. 3d). 
Furthermore, incomplete fission formed tears similar to those observed 
with compression (Extended Data Fig. 5a). Therefore, we conclude 
that compression planes are fission planes revealed by mechanical 
compression. Fission plane number and distribution correlated with 
worm length during tissue rescaling and regeneration. After starvation, 
worms reduced body length and lost fission planes to restore number 
and distribution (Extended Data Fig. 5b-d). To assay regeneration of 
the fission plane, we amputated worms around the pharynx such that 
90% of fragments contained a single plane (Extended Data Fig. 5e-g). 
One week after amputation, worms remodelled, doubled in length and 
increased fission plane number (Extended Data Fig. 5f-j). Subsequent 
feeding increased worm length and fission plane number (Extended 
Data Fig. 5f-j). After starvation, worms exhibited little to no elongation 
or plane addition despite rescaling and regenerating their other
tissues (Extended Data Fig. 5f-j). In summary, fission planes are pre-established
in planarians and correlate dynamically with worm size
and form.

Given the role of Wnt and TGFβ signalling in body patterning, we
tested whether genes of these signalling pathways regulate fission
planes. Worms treated with RNAi were mechanically compressed and
the quantity and relative distribution of fission planes was measured
(Fig. 3e-g). Notably, whereas RNAi of actR-1 and smad2/3 moderately
reduced the number of fission planes, RNAi of Wnt signalling components
had no effect on fission plane number or position (Fig. 3e, g,
Extended Data Fig. 6a). Even knockdown of wnt11-6 by three rounds
of amputation and regeneration did not alter fission-plane patterning
(Fig. 3f, g, Extended Data Fig. 6b). Hypomorphic RNAi knockdown of
β-catenin, actR-1 or smad2/3 revealed little or no effect on the size of
fission fragments (Extended Data Fig. 6c–e), which further supports
the conclusion that neither Wnt nor TGFβ signalling regulates fission
behaviour through the anterior–posterior patterning of fission planes.

Fig. 1 | Planarian fission is a size-dependent behaviour. a, Optimized
fission protocol. b, c, Representative images of 5-12-mm worms and
fission fragments less than 24 h (b) and 14 days (c) after the first fission
event (n = 46 worms from 1 experiment). Scale bars, 5 mm.
d, e, Length of the first fission fragments (d) and progeny number (e)
over 2 weeks relative to parent length (n = 46 (d) or 30 (e) worms from
one experiment). f, Webcam live-imaging schematic (left) and example
timeline depicting successful (middle) and unsuccessful (right) fission
attempts. g, Representative fission behaviour timelines from a range of
parent lengths. h, i, Total fission attempts (h) and successful attempts per
total attempts (i) relative to parent length (n = 39 (h) and 21 (i) worms).
Data are from a single experiment. Pearson correlation coefficient (PCC),
linear regression (red line), and R² values are provided.

We tested whether Wnt and TGFβ signalling instead regulated the
frequency of fission attempts. Using the automated webcam image-capture
system (Fig. 1f), we recorded fission behaviour in RNAi-treated
worms (Fig. 4a). RNAi of β-catenin, actR-1, smad2/3 and
wnt11-6 reduced fission attempts, whereas RNAi of apc increased
fission attempts (Fig. 4b-d, Extended Data Fig. 7a-l, Supplementary
Videos 4-6). RNAi of β-catenin and smad2/3, which resulted in observable
morphological abnormalities, also significantly reduced the
fission-success ratio (Figs. 2d, 4e, Extended Data Fig. 7k-n). dsh-B 
RNAi reduced the fission success ratio without altering the number 
or frequency of fission attempts (Fig. 4d, e, Extended Data Fig. 7k-n). 
Finally, apc RNAi reduced the time between fission attempts by approx- 
imately 50%, and worms initiated fission attempts independently of 
remaining tissue, markedly reducing their success ratio (Fig. 4e, 
Extended Data Fig. 7i-n, Supplementary Video 6). These findings
demonstrate that Wnt and TGFβ signalling regulate the frequency of
fission behaviour.

We proposed that components of the Wnt and TGFβ signalling
pathways might regulate fission behaviour through the planarian
CNS. Double fluorescent in situ hybridization (FISH) with the CNS
marker pc2 confirmed that Wnt and TGFβ fission regulators were
detected in pc2-positive cells in the anterior CNS (Extended Data
Fig. 8a, b). Removal of anterior tissue that contains the cephalic
ganglia delayed the onset of fission behaviour (Extended Data
Fig. 8c-f). Restoration of fission activity coincided with regeneration
and re-establishment of anterior, pc2 co-localized, tsh expression
(Extended Data Fig. 8g). Notably, removal of anterior tissue that
contained just one cephalic ganglion did not alter the total number
of fission progeny produced (Extended Data Fig. 8c-f), which
indicates that half of the CNS is sufficient to initiate fission. Finally,
RNAi against coe, a transcription factor essential for the patterning
of the CNS²²,²³, markedly reduced planarian fission (Extended Data
Fig. 8h, i). Together, these data support a model in which an anterior
CNS expressing Wnt and TGFβ signalling components regulates
fission initiation.

Fig. 2 | Wnt signalling and TGFβ signalling components modulate fission activity.
a, RNAi screen workflow (see also Extended Data Fig. 3). b, c, Heat maps depicting
fission activity after RNAi treatment for both the two-phase primary (b) and
secondary (c) RNAi screens. Normalized cumulative fissions over time are displayed
for individual worms from each RNAi condition (n = 10 worms for phase I, n = 12 worms
for phase II and secondary screen). Targets in secondary screening (independently
repeated three times) depicted in green (activators) and red (inhibitors).
P values determined by two-way analysis of variance (ANOVA) interaction factor. Ctrl,
control. d, Representative parent images on days 0 and 14 of the fission assay
(n = 10-12, independently repeated 3 times). Scale bars, 1 mm.

[Fig. 2 graphic not reproduced. Recoverable labels: a, workflow (recirculation system,
RNAi feedings, measure initial length, record number of fissions daily, normalize and
colour); b, primary RNAi fission screening; c, secondary RNAi fission screening;
d, parent images for Ctrl 1, actR-1, smad2/3, β-cat, dsh-B, wnt11-6, tsh, apc and Ctrl 2.]

Fig. 3 | Pre-established fission planes determine progeny size
independently of Wnt and TGFβ signalling. a, Schematic of compression
assay (Supplementary Video 3). b, Pre- and post-compression
worm (inset) and compression planes revealed in 3-6-mm worms
(independently repeated 5 times). c, Compression plane number relative
to worm length (n = 117 worms). PCC, linear regression (red lines),
R² values and 95% confidence interval (black lines) are shown. d, Fission
(n = 196 fission progeny from 50 worms) and compression plane (n = 173
planes from 30 worms) overlap along the anterior–posterior axis of the
worm. e, f, Representative images of post-compression worms after
knockdown of Wnt and TGFβ signalling components using the specified
number of RNAi feedings and rounds of regeneration (n depicted by dot
plot quantification; experiment performed three times (e) or once (f)).
Scale bars, 1 mm. g, Plot of the number of fission planes per worm length
after RNAi treatment of 2 experiments (n = 20 and 10 worms to the
left and right of the dotted line, respectively). P values determined
by two-sided t-test. NS, not significant. Data are mean ± s.d. (d) or
mean ± s.e.m. (g).

[Fig. 3 graphic not reproduced. Recoverable labels: ventral side up, pre-compression and
post-compression; control and β-catenin RNAi; 3× RNAi feedings, with or without 3×
regeneration; number of planes against length (mm), annotated PCC = 0.74 and R² = 0.55;
fissions and compression planes by fragment (1st to 7th); number of planes per length.]

29 AUGUST 2019 | VOL 572 | NATURE | 657

LETTER

Fig. 4 | Wnt and TGFβ signalling regulates fission frequency by size-dependent
patterning of mechanosensory neurons in the CNS.
a, Schematic depicting RNAi treatment, live imaging and data analysis.
b, c, Representative activity timelines. d, e, Total fission attempts (d) and
successful attempts per total attempts (e) for RNAi-treated worms (n = 16
(d) and 10 (e) worms, from 1 (d) or 2 (e) independent experiments).
f, Schematic depicting in situ staining strategy. g, h, Representative images
of pkd1L-2⁺ and gabrg3L-2⁺ neurons in worms of increasing size (g), or
after RNAi treatment (h). Scale bars, 0.5 mm. i, k, Diagrams depicting
quantification of the angle of pkd1L-2⁺ cells (i) or the range of gabrg3L-2⁺
cells (k). j, l, Staining quantification of pkd1L-2 (j) and gabrg3L-2 (l) in
worms of increasing size or after RNAi treatment (n = 3-6, exact n depicted
in dot plot quantification). m, Fission activity heat maps after treatment
with pkd1L-2 and gabrg3L-2 RNAi (n = 12; Fig. 2). n, Representative
parent images on days 0 and 14 of fission assay (n = 12, 2 independent
experiments). Scale bars, 1 mm. o, Representative fission activity timelines
of worms treated with gabrg3L-2 RNAi. p, q, Total fission attempts (p) and
successful attempts per total attempts (q) for worms treated with gabrg3L-2
RNAi (n = 10 worms). P values determined by two-sided t-test (j, l, p, q) or
two-way ANOVA (d, e). Data are mean ± s.e.m. (d, e, j, l, p, q).

[Fig. 4 graphic not reproduced. Recoverable labels: RNAi feedings, live imaging and data
visualization schematic; timelines of successful and unsuccessful fission against time
(days); CNS in situ staining of small, medium and large worms; angle, range and width
measurements for pkd1L-2⁺ and gabrg3L-2⁺ cells.]


We tested whether polarity genes could modulate size-dependent
behaviour via size-dependent patterning of the CNS. To identify
neuronal subpopulations that regulate fission downstream of Wnt
and TGFβ, we analysed 17 neuronal markers in small, medium
and large planaria and 10 markers in worms treated with smad2/3
RNAi (Fig. 4f, Extended Data Fig. 9a, b). Patterning of pkd1L-2⁺,
gabrg3L-2⁺ and sargasso-1⁺ mechanosensory neurons exhibited the
clearest changes in worms of increasing size and after smad2/3 RNAi
treatment (Extended Data Fig. 9a, b). In large worms, mechanosensory
neurons are tightly restricted to the anterior and knockdown of either
smad2/3 or wnt11-6 broadened their distribution akin to that of smaller
worms (Fig. 4g-l). RNAi against pkd1L-2 and gabrg3L-2 (homologous
to cation and chloride channel genes, respectively) increased planarian
fission activity (Fig. 4m, n), and live imaging of gabrg3L-2 RNAi worms
confirmed an increase in fission attempts without a reduction in fission
success (Fig. 4o, p, Extended Data Fig. 10, Supplementary Video 7).
These results indicate that mechanosensory neurons are differentially
patterned during growth, inhibit fission behaviour and require Wnt and
TGFβ for their appropriate patterning in accordance with worm size.
Therefore, we conclude that Wnt and TGFβ signalling coordinates worm
size and behaviour via size-dependent patterning in the adult CNS.


In conclusion, we used planaria as a model for the integration of size, 
patterning and function and established fission as a robust, reproduci- 
ble and quantifiable size-dependent behaviour (Fig. 1, Supplementary 
Video 1). Although previous studies have generated physical models 
for the process of transverse fission, mechanisms that couple worm
size and fission frequency have remained unknown. We discovered 
two independent mechanisms by which fission is coordinated with 
worm size in S. mediterranea. First, previously undescribed iterative 
structures patterned in accordance with anterior—posterior axis length 
couple worm size with the number of fission progeny produced (Fig. 3, 
Supplementary Video 3). Second, the Wnt and TGFβ signalling pathways
mediate size-dependent patterning of mechanosensory neurons,
which regulate fission frequency (Fig. 4, Extended Data Figs. 9, 10). 
Thus, we demonstrate that differential patterning of key cell popula- 
tions in accordance with tissue size provides a mechanistic link between 
worm growth and the acquisition or modulation of tissue function. 
Together, our results identify a role for Wnt and TGFβ patterning genes
in the regulation of size-dependent behaviour and show that develop- 
mental patterning cues coordinate tissue growth with size-dependent 
functions. 
METHODS


Worm husbandry. Clonal CIW4 S. mediterranea were maintained in 1× Montjuic
salts as previously described. CIW4 worms were sourced from a large recirculation
culture as previously reported. In brief, worms are housed in three culture trays
(244 cm length x 61 cm width x 30.5 cm height) stacked vertically. Water is 
recirculated through the system by a sump pump, which moves water through a 
chiller, a canister filter, a UV sterilizer and the three housing trays. Water is then 
passed through two vertically stacked sieves and a set of filter/floss pads before 
being returned to the sump pump. Worms were pulled from this system and placed 
directly into fission assays, starved for at least seven days before tissue fixation 
for imaging, or transferred to a unidirectional flow system culture for controlled 
feeding or RNAi feeding experiments. 

Gene cloning and RNAi feeding protocol. Candidate genes analysed in this study
were cloned from a CIW4 cDNA library into a pPR-T4P vector as previously
described (Supplementary Table 1). These served as templates for in vitro synthesis
of dsRNA for RNAi feedings. Unc22 dsRNA was used for control RNA treatment.
RNAi food was prepared by mixing 1 volume of dsRNA at 1,600 ng ml⁻¹ with
1.5 volumes of beef liver paste. For RNAi experiments that target neuronal genes,
1 volume of dsRNA at 1,400 ng µl⁻¹ was mixed with 1 volume beef liver paste. The
amount of food administered was 10 µl of food per 1 mm of worm length present
in the worm flow container. Worms were allowed to feed for 6-10 h with 2 rounds
of light stimulation to facilitate additional consumption. Worms were fed every
three days for a total of three RNAi feedings, unless otherwise specified. After RNAi
feedings, worms were transferred to the relevant biological assay.

Fission assay. A detailed protocol for fission induction has been made available
through Protocol Exchange. To induce fission, worms were removed from
recirculation culture or unidirectional flow system culture and washed 5-10 times
with fresh 1× Montjuic salts. Individual worms were placed in 15-cm tissue culture
dishes with 50 ml 1× Montjuic salts and their body length was measured.
Representative images of day-0 parents were captured using a Leica M205 microscope.
Plates were stacked 6-12 dishes high and placed in a dark incubator at 20°C.
Daily, plates were removed from the incubator and fission fragments for each
worm were counted and removed from the 15-cm dish. For some experiments,
images of fission fragments were taken on the day they were collected to allow for
quantification of fission fragment length. The 1× Montjuic salts in each individual
dish were replaced weekly.

For data analysis, the number of daily cumulative fissions was divided by initial 
body length and then normalized to the average of the control RNAi fissions. This 
normalized fission score for each day was converted to a heat colour code. Daily 
scores for each individual worm were aligned in descending order along the y axis 
and the average score of each column was calculated and used to sort individual 
worms in ascending order along the x axis. The average fission score of each RNAi 
condition was then sorted in ascending order from left to right. This resulted in a 
heat map visualization ranking the effects of RNAi treatments on fission activity. 
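The normalization described above can be sketched as follows. This is only one plausible reading of the procedure, not the authors' analysis code: the function names and the `(daily counts, initial length)` data layout are hypothetical, normalization is assumed to be done day by day against the mean length-normalized curve of the control RNAi worms, and days on which the control mean is zero are returned as `None` because the text does not say how they were handled.

```python
from typing import List, Optional, Sequence, Tuple

Worm = Tuple[Sequence[int], float]  # (fissions counted each day, initial length in mm)


def length_normalized_curve(daily_counts: Sequence[int], length_mm: float) -> List[float]:
    """Cumulative fissions per day divided by the initial body length."""
    curve, total = [], 0
    for count in daily_counts:
        total += count
        curve.append(total / length_mm)
    return curve


def normalized_fission_scores(
    worms: Sequence[Worm], controls: Sequence[Worm]
) -> List[List[Optional[float]]]:
    """Divide each worm's curve by the mean control curve, day by day."""
    control_curves = [length_normalized_curve(c, length) for c, length in controls]
    n_days = len(control_curves[0])
    control_mean = [
        sum(curve[day] for curve in control_curves) / len(control_curves)
        for day in range(n_days)
    ]
    scores = []
    for counts, length in worms:
        curve = length_normalized_curve(counts, length)
        # Assumption: a day with no control fissions yet cannot be normalized.
        scores.append(
            [value / mean if mean > 0 else None for value, mean in zip(curve, control_mean)]
        )
    return scores
```

The resulting per-day scores are the values that would then be mapped to a heat colour scale and sorted as described in the text.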
Fission plane compression assay. Fission planes were revealed by compression 
between a plastic tissue culture dish and a glass coverslip (Supplementary Video 3). 
Worms were inverted with their ventral side up, compressed using four fingertips, 
then imaged. To ensure that all compression/fission planes were revealed for every 
worm, images were acquired sequentially using a Leica M205 microscope as each 
fission plane was revealed by mechanical compression. Position of fission planes 
and distance between fission planes was quantified using Fiji (https://fiji.sc/). Video 
depicting compression assay was captured with an iPhone 6 (Apple). 
Whole-mount FISH. For RNA expression analyses, FISH was performed as
previously described. Antibodies were used in MABT containing 5% horse
serum for FISH (Roche anti-DIG-POD 1:1,000 and Roche anti-FLCN-POD
1:1,000) or NBT/BCIP in situ hybridization (Roche anti-DIG-AP 1:1,000). For
double FISH, peroxidase activity was quenched between tyramide reactions using
100 mM sodium azide for at least 1 h at room temperature with agitation. Nuclear
staining was performed using 1:1,000 Hoechst 33342 (Invitrogen) in PBST (1×
PBS with 0.5% Triton-X-100).

Microscopy. Images of live worms and regenerating fission fragments were acquired 
using a Leica M205 microscope. Confocal images were acquired on an LSM-700-Vis 
and stitching was performed in Fiji using built-in grid collection plugins. 

Live imaging of fission behaviour. Videos of worms from two orthogonal views 
were acquired using two webcams (Logitech C910/920). Webcams were mounted 
using a variety of ring stands and test tube clamps. The imaging chamber was a 
clear plastic square lid obtained from a box of coverslips. Lighting of the chamber 
was achieved using a Volpi illuminator (NCL-150). Each camera was connected to 
its own computer running micro-manager (https://micro-manager.org/). The cameras
were set up in micro-manager using OpenCV grabber to set the pixel density
(1,920 × 1,080) and to acquire the images. The camera gain, exposure and all other
settings were set using the Logitech Webcam Controller software
(https://download01.logi.com/web/ftp/pub/video/lws/lws280.exe). Data were acquired using
the Multi-Dimensional Acquisition mode of micro-manager. The two computers
were synchronized for acquisition manually at the beginning of the experiment. 
For the high-throughput screening of fission behaviour, worms were placed in 
six-well dishes with cameras mounted above the plates using optics components 
(Thor Labs). Illumination was obtained using four LED ring lights (AmScope) 
mounted upside down and above the cameras to provide diffuse light. Image acqui- 
sition was performed using two different camera configurations: four cameras 
connected to one computer via a USB hub or one 4K camera connected to a USB 
port. In the four-camera configuration, images were captured sequentially from 
the cameras every ten minutes. A script written in Python 3.6 
(https://www.python.org/) was used as a wrapper for FFMPEG (https://www.ffmpeg.org/) 
to acquire images. The size of the images (1,920 × 1,080) and the pixel format 
(yuv420p) were set in the Python script. The camera gain, exposure and other settings 
were controlled with the Logitech Webcam Controller software 
(https://download01.logi.com/web/ftp/pub/video/lws/lws280.exe). The DirectShow 
framework was used to interface between the cameras and FFMPEG. In the single 4K 
camera setup, a 4,096 × 2,160-pixel image was captured every ten minutes from a Logitech 
BRIO webcam. The same Python script was used as a wrapper for FFMPEG in 
this configuration. 
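The acquisition wrapper described above can be sketched roughly as follows. This is 
a minimal illustration under stated assumptions, not the authors' script (their code 
is deposited in the repository named under Code availability). The function names 
build_capture_command and capture_round, the camera names, the PNG output and the 
file-naming scheme are all hypothetical. The options -f dshow, -video_size, 
-pixel_format and -frames:v are standard FFMPEG options, but whether a given webcam 
offers yuv420p directly through DirectShow depends on the hardware. 

```python
import subprocess
from datetime import datetime
from pathlib import Path


def build_capture_command(camera_name, output_path,
                          size="1920x1080", pixel_format="yuv420p"):
    """Build an FFMPEG command that grabs one frame from a DirectShow camera."""
    return [
        "ffmpeg",
        "-y",                           # overwrite an existing output file
        "-f", "dshow",                  # DirectShow input device (Windows)
        "-video_size", size,            # requested capture resolution
        "-pixel_format", pixel_format,  # requested raw pixel format
        "-i", f"video={camera_name}",   # camera name as listed by FFMPEG
        "-frames:v", "1",               # stop after a single video frame
        str(output_path),
    ]


def capture_round(camera_names, out_dir, run=subprocess.run):
    """Capture one frame from each camera in turn (sequentially, not in parallel)."""
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    commands = []
    for index, name in enumerate(camera_names):
        # Hypothetical naming scheme: camera index plus acquisition timestamp.
        target = Path(out_dir) / f"cam{index}_{stamp}.png"
        command = build_capture_command(name, target)
        run(command, check=True)
        commands.append(command)
    return commands
```

Calling capture_round once and then sleeping for 600 s in a loop would reproduce the 
ten-minute interval; for the single 4K camera the same function could be called with 
one camera name and size="4096x2160". 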
Quantification of live imaging. Videos of individual worms were manually 
annotated. For each fission attempt, the start time, the completion time and the 
outcome (success or failure) were recorded. To depict fission behaviour, a 
timeline was constructed and a numerical value was given to each frame of a video. 
A value of 0 was assigned to any frame in which no fission behaviour was observed; 
a positive value was given to any frame during a successful fission attempt; and a 
negative value was given to any frame during a failed fission attempt (see Fig. 1f). 
A prolonged diagonal line in a timeline indicates a period in which frames were not 
acquired owing to failed communication between the image acquisition software 
and the webcam. 
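The frame-wise encoding described above can be illustrated with a short sketch. The 
function name fission_timeline, the tuple layout of the annotations and the use of 
+1 and -1 as the positive and negative values are assumptions made for illustration; 
the text specifies only the sign of the value assigned to each frame. 

```python
def fission_timeline(n_frames, attempts):
    """Return one value per frame: 0 (no fission), +1 (successful), -1 (failed).

    attempts is a list of (start_frame, end_frame, succeeded) tuples with
    inclusive frame indices, as produced by manual annotation.
    """
    timeline = [0] * n_frames
    for start, end, succeeded in attempts:
        value = 1 if succeeded else -1
        # Clip the annotated interval to the frames that actually exist.
        for frame in range(max(start, 0), min(end, n_frames - 1) + 1):
            timeline[frame] = value
    return timeline
```

For example, a six-frame video with a successful attempt spanning frames 1 to 2 and 
a failed attempt at frame 4 would be encoded as [0, 1, 1, 0, -1, 0]. 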
Statistical tests. For all pairwise comparisons, significance was tested using an 
unpaired Student's t-test. GraphPad Prism was used to calculate PCC values with 
a two-tailed 95% confidence interval and to perform linear regression analyses. 
Two-way ANOVA analysis was performed in GraphPad Prism to determine the 
significance of RNAi treatment over time. No statistical methods were used to 
predetermine sample size. The experiments were not randomized, and investiga- 
tors were not blinded to allocation during experiments and outcome assessment. 
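For readers who want to reproduce a pairwise comparison outside GraphPad Prism, the 
statistic underlying an unpaired Student's t-test can be sketched as below. This is 
an illustrative implementation that assumes equal variances in the two groups (the 
classical Student form); the function name unpaired_t_statistic is hypothetical, and 
the P value would still need to be obtained from a t distribution with the returned 
degrees of freedom. 

```python
import math
from statistics import mean, variance


def unpaired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for two independent samples.

    Uses the pooled sample variance, i.e. assumes equal group variances.
    """
    n_a, n_b = len(a), len(b)
    dof = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / dof
    standard_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / standard_error, dof
```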
Reporting summary. Further information on research design is available in 
the Nature Research Reporting Summary linked to this paper. 


Data availability 

Source data and construct sequences can be accessed from the Stowers Original 
Data Repository at http://www.stowers.org/research/publications/libpb-1356. All 
other data are available from the corresponding author upon reasonable request. 


Code availability 

Code for the Python 3.6 (https://www.python.org/) script used as a wrapper for 
FFMPEG (https://www.ffmpeg.org/) for the high-throughput recording of fission 
behaviour is available at the Stowers Original Data Repository at 
http://www.stowers.org/research/publications/libpb-1356. 
Extended Data Fig. 1 | Characterization of planarian fission biology. 
a, Live imaging of large planarian worm during fission (representative 
of 12 experiments; see also Supplementary Video 1). b, Imaging of 
single individual large planarian and regenerating progeny 0, 4, 8 
and 12 days after fission induction (experiment repeated 50 times). 
c, d, Anterior–posterior length of progeny (c) and time to fission event 
(d) since induction or the previous fission (n = 50 worms). Fission 
fragments binned by position along the anterior–posterior axis 
(the first fission is the most posterior). e, Schematic of fission induction 
and quantitative scoring system used to compare fission activity between 
different conditions. f, Cumulative fission fragments produced over 
14 days by individual worms binned by parent size (n = 10 per bin). 
g, h, Time to first fission event (g) or time between sequential fission 
events (h) for worms 6–8 mm, 9–12 mm or 13–17 mm in length. i, Raw 
parent length measurement of planarian individuals 6–8 mm, 9–12 mm 
or 13–17 mm in length. j, Time between first and second fission events for 
worms 6–8 mm, 9–12 mm or 13–17 mm in length (n = 139 independent 
measurements from 30 worms). k, l, Time between induction and first 
fission (k) or between first and second fission (l) plotted relative to parent 
length (n = 26 and 21 independent measurements from 30 worms). PCC, 
linear regression and R² values are provided. P values determined by 
two-sided t-test. Data are mean ± s.e.m. (c, d, j). 
Extended Data Fig. 3 | Strategy for a targeted RNAi screen to 
identify regulators of fission. a, Detailed schematic of RNAi workflow. 
Worms are grown to an optimal size in the recirculation culture system 
and transferred to a flow system for RNAi feedings. After 3 RNAi 
feedings, worms were transferred to a 15-cm dish and worm length was 
recorded. The number of fissions was recorded daily for 14 days for 
each worm from each RNAi condition. For data analysis, the number 
of daily cumulative fissions was divided by initial body length and 
then normalized to the average of the control RNAi fissions. For data 
visualization, this normalized fission score for each day was converted to 
a heat colour code. Daily scores for each individual worm were aligned 
in ascending order along the y axis. The average score of each column is 
calculated and used to sort individual worms in ascending order along the 
x axis. The average fission score of each RNAi condition was then sorted 
in ascending order from left to right. The result is a heat-map visualization 
that ranks the effects of RNAi treatments on fission activity. b, Wnt, TGFβ 
and Hh signalling pathway diagrams focusing on components targeted for 
the RNAi screen. Green arrows indicate positive interactions; red arrows 
indicate inhibitory interactions. 
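The normalization described in this caption can be sketched for a single day of the 
assay as follows. The function names, the argument layout and the choice to normalize 
against the control mean from the same day are assumptions made for illustration; 
the caption states only that cumulative fissions were divided by initial body length 
and normalized to the average of the control RNAi fissions. 

```python
from statistics import mean


def length_corrected(cumulative_fissions, lengths_mm):
    """Divide each worm's cumulative fission count by its initial length."""
    return [count / length for count, length in zip(cumulative_fissions, lengths_mm)]


def normalized_scores(treated, treated_lengths, control, control_lengths):
    """Normalize length-corrected fission counts to the control RNAi average."""
    control_mean = mean(length_corrected(control, control_lengths))
    if control_mean == 0:
        # No control worm has fissioned yet, so the ratio is undefined.
        raise ValueError("control average is zero; score is undefined")
    return [value / control_mean
            for value in length_corrected(treated, treated_lengths)]
```

With this convention a score of 1 corresponds to the average control behaviour, 
values below 1 to reduced fission activity and values above 1 to increased activity. 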
Extended Data Fig. 4 | Analysis of morphology and/or internal tissues 
in regenerating fragments and fissioning parents. a, Representative 
images of regenerating tissue fragments from different positions along 
the anterior–posterior axis at 15 days post-amputation (dpa). Fraction of 
worms with pictured phenotype along with 1-mm scale bar depicted below 
each image. b, c, In situ staining of CNS (pc2), intestine (porc) and muscle 
(t-mus) tissues at day 15 of regeneration (b) or the fission assay (c). 
d, High-resolution image of body wall musculature (t-mus) in control 
RNAi and smad2/3 or β-catenin RNAi treated worms. Representative 
images (n = 7–13 worms) from a single experiment. All images are 
oriented ventral side up with anterior on the left side. Scale bars, 0.5 mm. 
Extended Data Fig. 5 | Effects of growth, starvation and regeneration 
on fission planes. a, Image of planaria after incomplete fission, revealing 
ventral tear identical to compression planes (observed more than five 
independent times). b, Post-compression worms at 5, 18 and 30 days 
post-fertilization (dpf) (5 dpf image from same experiment as Fig. 3b). Data are 
from a single experiment. c, d, Bidirectional plot of compression planes 
versus worm length (n = 25 worms) (c), and relative distribution of planes 
(d) at 5, 18 or 30 dpf (n = 28, 18, 31, 15 and 19 worms (left to right in d)). 
e, Schematic of experiment tracking establishment of fission planes during 
tissue regeneration. f, g, Representative images (f) and bidirectional plot 
of compression planes versus worm length (g) after amputation (1 dpa, 
n = 15 worms), regeneration (8 dpa, n = 19 worms) and growth 
(fed 14 dpa and 25 dpa, n = 12 and 32 worms) or de-growth (starved 
25 dpa, n = 15 worms). Data from a single experiment. h–j, Worm length 
(h), number of compression planes (i) and relative distribution of planes 
(j) after amputation (1 dpa, n = 15 worms), regeneration (8 dpa, n = 19 
worms) and growth (fed 14 dpa and 25 dpa, n = 12 and 32 worms) or 
de-growth (starved 25 dpa, n = 15 worms). Data are mean ± s.d. (c, g) or 
mean ± s.e.m. (h–j). 
Extended Data Fig. 6 | Effects of RNAi of Wnt and TGFβ signalling 
components on fission planes. a, b, Relative plane distribution after 
RNAi treatment (n = 20 (a) and 10 (b) worms). c, Representative images 
of progeny within 24 h of fission and of remaining parent tissue at day 28 
after fission induction for worms treated with control, β-catenin, actR-1 
or smad2/3 RNAi (experiment independently performed twice). Scale 
bar, 1 mm. d, e, Length of the first fission progeny (d) or all subsequent 
progeny (e) in worms treated with control, β-catenin, actR-1 or smad2/3 
RNAi (n = 85 fission fragments from 36 worms). P values determined by 
two-way ANOVA interaction factor (a, b) or two-sided t-test (d, e). Data 
are mean ± s.e.m. 
Extended Data Fig. 7 | Wnt and TGFβ signalling components regulate 
the frequency of fission initiation. a–h, All individual timelines 
depicting fission activity over 9–10 days for worms treated with control 
(a, g), actR-1 (b), smad2/3 (c), β-catenin (d), dsh-B (e), apc (f) or wnt11-6 
(h) RNAi. i–n, Graphs depicting the time between sequential fission 
attempts (i, j), the number of successful fission attempts (k, l) and the 
number of unsuccessful fission attempts (m, n) in worms fed 
double-stranded RNA (dsRNA) that targets regulators of fission (n = 421 fission 
events from 116 worms). Worms were given either 3 (a–f, i, k, m) or 18 
(g, h, j, l, n) dsRNA feedings. Batched experiments are plotted separately. 
P values determined by two-sided t-test. Data are mean ± s.e.m. 
Extended Data Fig. 8 | The planarian anterior CNS regulates fission. 
a, Whole-brain imaging of pc2 and fission regulator gene expression 
detected by double FISH (n = 2–4 worms; experiment independently 
repeated). Scale bars, 100 μm. b, Single-cell co-expression of pc2 and 
fission regulators in the posterior branches of the anterior CNS 
(n = 3–5 worms). Scale bar, 50 μm. c, Fission induction in intact, 100% 
head-amputated or 50% head-amputated worms over a 9-day observation 
period (n = 12 worms). d–f, Total number of fission progeny over 9 days 
(d), the time between fission induction and first fission (e), and the time 
between first and second fission (f) for intact, 100% head-amputated 
or 50% head-amputated worms (n = 94 fission events from 36 worms). 
g, Regeneration time course in 100% head-amputated worms showing 
recovery of anterior gene expression of pc2 co-localized with teashirt 
(n = 4–5 worms; experiment performed once). Scale bar, 500 μm. h, Heat 
maps depicting fission activity after treatment with coe RNAi. Normalized 
cumulative fissions over time are displayed for individual worms from 
each RNAi condition (n = 12 worms). i, Representative parent images 
on days 0 and 14 of the fission assay (n = 12, experiment independently 
performed twice). Scale bars, 1 mm. P value determined by two-sided 
t-test (d–f) or two-way ANOVA (h). Data are mean ± s.e.m. (d–f). 
Extended Data Fig. 9 | Comparison of neuronal subpopulations in 
worms of increasing size and after smad2/3 RNAi treatment. 
a, Representative images of neuronal marker staining in small, medium 
and large worms (n = 3–5 worms; 1 experiment). b, Representative images 
of a subset of neuronal markers analysed in worms treated with smad2/3 
RNAi (n = 3–5 worms; 1 experiment). Scale bars, 0.5 mm. 
Extended Data Fig. 10 | gabrg3L-2 negatively regulates the frequency of fission initiation. a, b, All individual timelines depicting fission activity 
over 9 days for worms treated with control (a) or gabrg3L-2 (b) RNAi (n = recordings of 10 worms combined from 2 independent experiments). 
Reconstituting the transcriptome and DNA 
methylome landscapes of human implantation 

Fan Zhou, Rui Wang, Peng Yuan, Yixin Ren, Yunuo Mao, Rong Li, Ying Lian, Junsheng Li, Lu Wen, 
Liying Yan, Jie Qiao & Fuchou Tang 


Implantation is a milestone event during mammalian 
embryogenesis. Implantation failure is a considerable cause 
of early pregnancy loss in humans¹. Owing to the difficulty of 
obtaining human embryos early after implantation in vivo, it 
remains unclear how the gene regulatory network and epigenetic 
mechanisms control the implantation process. Here, by combining 
an in vitro culture system for the development of human embryos after 
implantation and single-cell multi-omics sequencing technologies, 
more than 8,000 individual cells from 65 human peri-implantation 
embryos were systematically analysed. Unsupervised dimensionality 
reduction and clustering algorithms of the transcriptome data show 
stepwise implantation routes for the epiblast, primitive endoderm 
and trophectoderm lineages, suggesting robust preparation for 
the proper establishment of a mother-to-offspring connection 
during implantation. Female embryos showed initiation of 
random X chromosome inactivation based on analysis of parental 
allele-specific expression of X-chromosome-linked genes during 
implantation. Notably, using single-cell triple omics sequencing 
analysis, the re-methylation of the genome in cells from the primitive 
endoderm lineage was shown to be much slower than in cells of 
both epiblast and trophectoderm lineages during the implantation 
process, which indicates that there are distinct re-establishment 
features in the DNA methylome of the epiblast and primitive 
endoderm—even though both lineages are derived from the inner 
cell mass. Collectively, our work provides insights into the complex 
molecular mechanisms that regulate the implantation of human 
embryos, and helps to advance future efforts to understand early 
embryonic development and reproductive medicine. 

Human embryonic development starts from a fertilized egg, after 
which a free-floating blastocyst is formed, which consists of an outer 
trophectoderm (TE) and an inner cell mass. The mature inner cell 
mass is composed of the pluripotent epiblast (EPI) covered by a layer 
Fig. 1 | Single-cell RNA-sequencing transcriptome profiling of human 
post-implantation embryos. a, Schematic illustration of the strategy for 
the collection of single cells, and transcriptome and DNA methylome 
analyses used in this study. DEGs, differentially expressed genes; scBS-seq, 
single-cell bisulfite sequencing; scRNA-seq, single-cell RNA sequencing; 
STRT-seq, single-cell tagged reverse transcription sequencing; TF, 
transcription factor. b, Representative bright-field images of human 
in vitro cultured embryos until day 14. The blastocysts clearly attached 
to the bottom of the culture plate and formed a ring-like structure, with 
the EPI cells surrounded by TE cells, at approximately days 7–8. ICM, 
inner cell mass. All scale bars, 100 μm. All embryos used in this study are 
described in Supplementary Table 1. c, d, Cells from embryos containing 
all three major lineages at four representative stages were projected on 
to the t-SNE map, enabling the identification of the developmental path 
and cell lineage. Cells were identified as EPI, PE, TE and ysTE cells. 
Cells (dots) are coloured according to embryonic stages (c) and original 
lineage identities (d). c, d, In total, 3,184 single cells were included. c, Day 
6, n = 387 cells; day 8, n = 1,525 cells; day 10, n = 1,021 cells; day 12, 
n = 251 cells. d, EPI, n = 282 cells; PE, n = 138 cells; TE, n = 2,725 cells; 
ysTE, n = 39 cells. 
Fig. 2 | Transcriptome dynamics at post-implantation stages. 
a, The expression patterns of lineage signature genes. Among these genes, 
several were recognized as classical lineage-specific marker genes, 
such as NANOG for EPI, GATA4 for PE and TFAP2A for TE. A marker that 
has been identified in the yolk sac of mice, APOA4, was identified as one 

of the signature markers in the PE lineage*°. ACKR2 was identified as a TE 
marker here, consistent with its known function for regulating placenta 
development in mice. The ysTE specifically expressed CD44, which has 
been reported as a critical gene for trophoblast cell invasion. On the basis 

of the signatures of each lineage, the EPI signatures were enriched in Gene 
Ontology (GO) terms related to embryonic morphogenesis, gastrulation and 
transcriptional regulation of pluripotent stem cells. The PE signatures were 
enriched in GO terms related to cell-fate specification, embryonic organ 
development and epithelial cell differentiation, and the TE signatures were 
enriched in response to steroid hormone, extracellular matrix organization 
and vasculature development (Supplementary Table 6). Such changes among 
lineages are probably to balance pluripotency maintenance/transition and 


of primitive endoderm (PE; also known as the hypoblast). The EPI, 
PE and TE then give rise to the embryo proper, yolk sac and placenta, 
respectively. By embryonic days 6-7 (after fertilization), the embryo 
will implant into the uterus to form a gastrula, followed by organogen- 
esis. Owing to the limited access to embryos early after implantation 
in vivo, the lineage specification and corresponding patterns of the 
transcriptome and DNA methylome that are specific to the different 
lineages during human implantation are still poorly understood. In this 
study, with the help of donated human embryos and a robust in vitro 
culture system for post-implantation embryos and single-cell multi- 
omics sequencing technologies”-°, we simultaneously analysed the 
gene-expression network and lineage-specific DNA methylation pat- 
terns of the human peri-implantation embryos at single-cell resolution. 

First, we mimicked the implantation of human embryos as 
previously reported”* (Fig. 1a, b, Extended Data Fig. 1a–d and 
Supplementary Videos 1-5). The culture of embryos was terminated 
at day 14 according to bioethical guidelines®’. Next, we profiled 7,636 
individual cells from 48 human pre/post-implantation embryos at 
5 consecutive developmental stages, including the blastocyst stage 
(day 6, pre-implantation) and 4 later stages (days 8, 10, 12 and 14 
lineage specification for different developmental potentials to support the 
continued development of the embryo after implantation. b, Morphological 
visualization of OTX2 expression in PE cells (n = 3). Series of confocal 
z-sections of the embryo stained for OCT4 (red), GATA6 (green) and OTX2 
(cyan). Arrowheads, GATA6⁺ but OTX2⁻ cells. Arrows, GATA6⁺ and 
OTX2⁺ cells. All scale bars, 20 μm. c–e, Stage-specific gene-expression 
patterns for three major lineages. There were 67 (EPI), 224 (PE) and 282 
(TE) genes that showed stage-specific expression features in each lineage 
(c–e; Supplementary Table 6). GO analysis showed that EPI stage-specific 
genes were clearly enriched in embryonic morphogenesis genes (such 
as SALL1 and HAND1) between day 6 and day 12. Additionally, PSG2, a 
member of the pregnancy-specific glycoprotein (PSG) gene family, was 
upregulated from day 8 to day 10 in the TE lineage, which indicates that 
the embryo might be preparing for mother-fetal interactions during 
implantation. The colours from blue to red represent the expression levels 
from low to high. 



after implantation) (Fig. 1c and Supplementary Tables 1, 2). In total, 
5,911 single cells were retained for subsequent analyses following 
stringent filtering (Extended Data Fig. 1e–h). Unsupervised t-distributed 
stochastic neighbour embedding (t-SNE) analysis revealed that all of 
the cells were grouped by their developmental states (Extended Data 
Fig. 1i). Furthermore, t-SNE analyses partitioned the cells into four 
main clusters. The analysis of the expression of known markers and 
lineage scores identified these clusters as the EPI, PE, TE and yolk-sac 
trophectoderm (ysTE)**? (Extended Data Figs. 1j, k, 2, Supplementary 
Methods and Supplementary Table 3). As there was only a limited 
number (only 39 cells in total) of ysTE cells and to avoid the poten- 
tial influence of embryos that lacked lineage(s) on the outcome of the 
analysis, we next focused mainly on the features of the three major 
lineages in those embryos that contained all three of the major lineages 
(EPI, PE and TE); 3,184 individual cells were retained in the subsequent 
analyses (Fig. 1c, d and Extended Data Fig. 1g). 

Next, we identified genes that were specifically expressed in the 
EPI, PE and TE cells, and defined these as lineage signature genes. 
In addition to the canonical lineage markers, the EPI expressed 
KHDC3L, TDGF1, CXCL12 and THY1, which have previously been 
Fig. 3 | CNVs and unsynchronized X chromosome inactivation among 
different lineages during implantation. a, CNV information projected 
onto the t-SNE plot based on the transcription-factor regulatory network. 
In total, 3,184 single cells were included; 1,027 cells were aneuploid and 
2,157 cells were euploid. b, The ratio of total expression levels of genes 
located on the X chromosome (Chr.X) and the same number of genes 
located on autosomes (A). In total, 3,184 cells were included; male, 
n = 1,016 cells; female, n = 2,168 cells (Supplementary Table 1). Black 
lines indicate median values, the boxes range from the 25th to 75th 
percentiles and the whiskers correspond to 1.5× the interquartile range 
(IQR; the distance between the first and third quartiles). c, The proportion 
of bi-allelic expression of chromosome-X-linked genes compared to the 
expression of autosome-linked genes (Supplementary Methods). In total, 
150 cells were included; day 6, n = 24 cells; day 8, n = 30 cells; day 10, 
n = 70 cells; day 12, n = 26 cells. Black lines indicate median values, the 
boxes range from the 25th to 75th percentiles and the whiskers correspond 
to 1.5× the IQR. d, The percentage differences between maternal and 
paternal alleles (maternal allele (%) − paternal allele (%)) for each cell. 
According to the heterozygous single-nucleotide polymorphisms, reads 
were traced to their parental origins. For each chromosome, the ratio of 
reads from paternal or maternal alleles was calculated (Supplementary 
Methods). In total, 150 cells were included. For each developmental stage, 
the mean values for maternal and paternal alleles (maternal (y > 0), paternal 
(y < 0)) were calculated for chromosomes X and 1. Mean values are shown 
as pink and blue centre lines for maternal and paternal alleles, respectively. 


reported to be signatures of EPI during the implantation process in 
monkeys?” (Fig. 2a and Supplementary Tables 4-6). Immunostaining 
showed that OTX2, a pluripotency marker for EPI development during 
implantation", was expressed in a fraction of GATA6-expressing PE 
cells, but not in OCT4-expressing EPI cells, verifying our single-cell 
RNA-sequencing data (Fig. 2b and Extended Data Fig. 2d). Human 
and monkey embryos shared partial signatures of these three lineages 
and related derivatives’, although human embryos also expressed 
several unique signature genes, such as UTF1 in EPI, GPC3 in PE and 
CYP19A1 in TE (Extended Data Fig. 3). 

Principal component analysis and pseudo-time analysis 
revealed that all three lineages presented their own developmental con- 
tinuity, suggesting that there are stepwise implantation routes (Extended 
Data Fig. 4a, b). Further analysis of stage-specific gene-expression 
patterns for each lineage indicated that the embryo started preparing for 
mother-fetal interactions during implantation (for example, the expres- 
sion of embryonic morphogenesis and pregnancy-associated genes; 
Fig. 2c-e and Extended Data Fig. 4c-f). Notably, the t-SNE results 
showed that TE gradually formed two separate subgroups around days 
10-12. The differential expression of HCGB family genes corresponded 
to the subgrouping of TE by days 12-14 (Extended Data Fig. 4g). 
We determined that these subgroups consisted of cytotrophoblasts 
(which specifically expressed ITGA6) and syncytiotrophoblasts (which 
specifically expressed CGB family genes)*!*. Cytotrophoblasts also 
expressed regulators of the TE and placenta, such as FABP5 and FGFR1, 
whereas syncytiotrophoblasts expressed hormone-related genes of the 
placenta (for example, PSG3 and PSG6) and a number of newly iden- 
tified genes (for example, TCL6 and TBX3) (Extended Data Fig. 5). 
Furthermore, we found that both meiosis- and mitosis-derived copy 
number variations (CNVs) were widely present in the cultured embryos 
during implantation'*"4 (Extended Data Fig. 6 and Supplementary 
Methods). These aneuploid cells still clustered with the corresponding 
euploid cells in the t-SNE map, which suggests that the differentiation 
of the major lineages was generally not distorted by mild CNVs at the 
early stage of implantation (Fig. 3a). 

Inactivation of the X chromosome is important for the dosage 
balance of X-linked genes between females (XX) and males (XY), 
whereas upregulation of the expression of genes on the X chromosome 
is critical for the dosage balance between X-linked genes and autosomal 
genes'>-!?, The expression of X-linked genes should be equivalent to 
that of autosomal genes, which is achieved through upregulation of 
genes on the X chromosome in both male and female cells”®. Further 
measurements showed that the ratio of X chromosome to autosomes 
in male cells was near 2:2 (but not 1:2), indicating that the upregulation 
of the X chromosome had already started and the expression levels 
of genes on the only copy of the X chromosome in a male cell had 
already become comparable to those on two copies of the autosomes. 
Comparatively, the ratio of X chromosomes to autosomes in female 
cells appeared to be above 1 and slightly higher than the ratio found in 
male cells during implantation”! (Fig. 3b and Extended Data Fig. 7). 
We therefore suggest that both the upregulation and inactivation of 
the X chromosome had started, but were not fully completed, in female 
cells at day 12. That is, one of the X chromosomes had randomly been 
inactivated and expression of genes on this X chromosome had been 
downregulated, whereas the expression of genes on the other, active, 
X chromosome had increased by nearly twofold. Subsequently, 238 
representative single cells for all three major lineages were selected for 
full-length cDNA high-depth sequencing to separate parental alleles 
within each individual cell (Supplementary Tables 1, 7). The genomes 
of the parental donors were also sequenced to call single-nucleotide 
polymorphisms (Supplementary Methods). At day 6, the majority of 
the cells still expressed X-linked genes in a balanced way from pater- 
nal and maternal alleles. However, at later stages many cells showed 
gradual accumulation of paternal- or maternal-allele-biased expression 
patterns, clearly indicating the initiation of random X chromosome 
inactivation during the early implantation period (Fig. 3c, d). 

DNA methylation has critical roles in the epigenetic regulation 
of the development of mammalian embryos. Nevertheless, the 
lineage-specific DNA methylation dynamics around implantation 
remain largely unknown. Using the single-cell Trio-seq2 strategy” and 
another round of embryonic culture followed by the collection of single 
cells, 2,544 individual cells from 17 embryos were first analysed for 
lineage identification (Extended Data Fig. 8, Supplementary Methods 
and Supplementary Tables 8-10). We subsequently selected 371 
lineage-specific individual cells for post-bisulfite adaptor-tagging DNA 
methylome sequencing (Extended Data Fig. 8b-d). Furthermore, 130 
euploid cells were retained for subsequent analyses after removal of 
aneuploid cells (Extended Data Fig. 9). Principal component analysis 
of DNA methylome data showed that these 130 cells formed 4 major 
clusters (Extended Data Fig. 8g, h), with a combination of the EPI, PE 
and TE at the blastocyst stage (day 6) as a single cluster, and the EPI, 
PE and TE beyond the blastocyst stage as another 3 separate clusters, 
suggesting that all of the 3 lineages showed considerable changes in 
DNA methylation soon after implantation. 

Next, we explored the dynamics of DNA methylation of each lineage. 
In general, EPI, PE and TE experienced strong genome re-methylation 
during implantation (Fig. 4a and Extended Data Fig. 10a). The median 
Fig. 4 | Lineage-specific dynamics of the DNA methylome in human 
peri-implantation embryos. a, The global DNA methylation levels 
across different developmental stages for each lineage. b, Distribution 
of the relative enrichment of increased methylation tiles from day 6 to 
day 8 at various genomic features. Between these stages, 1,512,276 (EPI), 
310,255 (PE) and 1,140,524 (TE) tiles (300 bp) were de novo-methylated 
in the respective lineages. 
The re-methylated genomic regions were strongly enriched in enhancers 
and mammalian-wide interspersed repetitive (MIR) regions but depleted 
in the promoters and CpG islands (CGI) in all lineages. The EPI and 
PE lineages shared re-methylation enrichment of long interspersed 
nuclear element-2 (LINE-2) retroelements and intergenic regions. For 
PE, re-methylated genomic regions were strongly enriched for several 
families of repeat elements (for example, alpha satellites (ALR) and long 
terminal repeats (LTR)). For TE, re-methylated genomic regions were 
strongly enriched for intragenic regions (both exons and introns). This 
showed that DNA re-methylation has strong genomic-element-specific 
and clear developmental-lineage-specific features during implantation. 
Alu, Alu element; ERVs, endogenous retroviruses (ERVK, ERVL); HCP, 
high CpG density promoter; ICP, intermediate CpG density promoter; 
LCP, low CpG density promoter; MaLR, mammalian LTR; SINE, short 
interspersed nuclear elements; SVA, short interspersed element-variable 
number tandem repeat-Alu. c, The average DNA methylation levels for 
differentially methylated genes at promoter regions (transcription 
start sites plus the upstream 250 bp and downstream 250 bp). 
d, DNA methylation levels of representative loci for lineage-specific genes 
at promoter regions. Each column represents one read. Red represents 
methylated CpG sites, blue represents unmethylated CpG sites and white 
represents undetected sites. Only reads that covered at least five CpG sites 
are shown in the heat map. 


DNA methylation level of EPI markedly increased from 26.1% at day
6 to 60.0% at day 10. The median DNA methylation level of TE also
increased, from 23.5% at day 6 to 46.3% at day 10, a smaller increase
than that of the EPI (Fig. 4a). Notably, although the DNA
methylation level of PE (27.0%) at day 6 was comparable to that of EPI
(26.1%), it increased only to 33.2% at day 8 (compared to 49.9% in
EPI) and 36.8% at day 10 (compared to 60.0% in EPI). Unexpectedly,
the PE showed much slower DNA re-methylation dynamics during
implantation than the EPI, even though both cell lineages are
derived from the inner cell mass. These results suggest that the embryos
initiated considerable re-methylation of the DNA shortly after the
blastocyst stage, and that the three major lineages not only had different
gene-expression signatures but also showed markedly distinct and
asynchronous DNA re-methylation patterns.
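
The re-methylation gains quoted above can be tabulated directly. The following is a minimal sketch that uses only the median methylation levels reported in the text; the dictionary layout and the helper name `gain` are illustrative choices and are not part of the study's analysis pipeline.

```python
# Median DNA methylation levels (%) reported in the text, by lineage and day.
# Day-8 values are quoted only for EPI and PE.
median_methylation = {
    "EPI": {6: 26.1, 8: 49.9, 10: 60.0},
    "PE": {6: 27.0, 8: 33.2, 10: 36.8},
    "TE": {6: 23.5, 10: 46.3},
}


def gain(lineage: str, start: int = 6, end: int = 10) -> float:
    """Return the increase in median methylation, in percentage points."""
    levels = median_methylation[lineage]
    return round(levels[end] - levels[start], 1)


gains = {lineage: gain(lineage) for lineage in median_methylation}

# Print lineages from the largest to the smallest gain.
for lineage, value in sorted(gains.items(), key=lambda item: -item[1]):
    print(f"{lineage}: +{value} percentage points from day 6 to day 10")
```

On these figures the EPI gains 33.9 percentage points, the TE 22.8 and the PE 9.8, which is the ordering described in the text.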

The patterns of hypermethylated gene bodies and hypomethylated
promoter regions were shared by all three lineages (Extended Data
Fig. 8i), similar to the patterns in pre-implantation embryos.
Further analysis of re-methylated genomic regions in all three lineages
showed that DNA re-methylation has clear genomic element-specific
and cell-lineage-specific features during implantation (Fig. 4b).

We found that 241 genes showed increased levels of methylation at
their promoter regions, potentially silencing their expression during
implantation (Extended Data Fig. 10b), and each lineage carried
specifically methylated genes (Fig. 4c). For instance, the promoters of
POU5F1 (also known as OCT4) and NANOG, master regulators of
pluripotency, were specifically methylated in TE cells, but not in EPI
and PE cells, at day 8 (Fig. 4c). By contrast, the promoters of
development genes in the TE, such as MMP26, PSG7 and ELF5, were
specifically methylated in EPI and PE cells but not in TE cells at day 8
(Fig. 4c and Extended Data Fig. 10c). Similarly, the promoters of PE
markers such as APOA1 and CPN1 were specifically methylated in
EPI and TE cells but not in PE cells at day 8 (Fig. 4c, d). These findings
indicate that DNA methylation might have an important role in
regulating the expression of lineage-specific genes and maintaining the
segregated cell fates of different lineages.

We reconstituted the transcriptome and DNA methylome landscapes
of human implantation at single-cell resolution and uncovered
key developmental events, the molecular dynamics of which were
previously unresolved. Although many potential differences may exist
between in vivo and in vitro implantation systems, this study provides
a potential basis for the development of better strategies to mimic
this unique process in vitro. A better understanding of the implantation
process will also provide valuable information (such as
lineage-specific transcription-factor networks and signalling pathway
characteristics; Supplementary Methods) for the derivation and directed
differentiation of pluripotent stem cells in vitro, which could be an
invaluable resource for reproductive and regenerative
medicine.


Reporting summary 
Further information on research design is available in the Nature Research 
Reporting Summary linked to this paper. 

Extended Data Fig. 1 | Lineage and sex identification of human pre- and
post-implantation embryos. Related to Fig. 1. a-d, Immunofluorescence
images of human embryos at different developmental stages (n = 3). Scale
bars, 40 µm (a), 30 µm (b), 50 µm (c), 40 µm (d). e, Cell numbers of both
sexes at each embryonic day. In total, 7,636 single cells were included. Day
(D)6, n = 1,029 (filtered, n = 296; female, n = 207; male, n = 526). Day
8, n = 2,260 (filtered, n = 183; female, n = 1,288; male, n = 789). Day
10, n = 2,516 (filtered, n = 806; female, n = 1,022; male, n = 688). Day
12, n = 1,183 (filtered, n = 329; female, n = 281; male, n = 573). Day 14,
n = 648 (filtered, n = 111; female, n = 0; male, n = 537). f, The number of
expressed genes in each individual cell for different developmental stages.
On average, 7,100 expressed genes and 139,792 transcripts were detected
in each individual cell. Black lines indicate median values, the boxes range
from the 25th to 75th percentiles and the whiskers correspond to 1.5× the
IQR. In total, 5,911 single cells were included (day 6, n = 733;
day 8, n = 2,077; day 10, n = 1,710; day 12, n = 854; day 14, n = 537).
g, The number of cells and embryos of each cell lineage at distinct stages
after filtering. E, embryo. h, The mean expression levels of genes located on
chromosome X (pink) and Y (blue) for each embryo. Mean
expression ratios between X and Y are shown in green. Cells are ordered
by sex and embryonic day. In total, 23 embryos with a sex expression ratio
above two were defined as female embryos (2,798 cells from 4 stages); the
remaining 25 embryos were male embryos (3,113 cells from 5 stages). The
sex of the embryos highlighted in orange was confirmed by single-cell
whole-genome sequencing. i, j, Unsupervised t-SNE plots of all cells at
five representative stages, revealing a developmental path and cell lineage
identification. i, In total, 5,911 cells were included (day 6, n = 733; day 8,
n = 2,077; day 10, n = 1,710; day 12, n = 854; day 14, n = 537). j, Cells
were identified as EPI, PE, TE and ysTE cells. Cells (dots) are coloured
according to embryo stage and original lineage identity. EPI, n = 330;
PE, n = 179; TE, n = 5,363; ysTE, n = 39. Clusters were assigned to
indicate cell lineages using known lineage-specific markers. k, Lineage
identification was further confirmed by lineage score analysis. The
ggtern plot shows the lineage scores for each individual cell, calculated
using previously published lineage-specific genes. The cells are coloured
according to their lineage identity.
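
The cell counts in this caption can be cross-checked against each other. The sketch below simply re-adds the numbers quoted for panels e, h and j; the variable names are illustrative and nothing here comes from the authors' code.

```python
# Cell counts per embryonic day from panel e: (filtered out, female, male).
counts = {
    6: (296, 207, 526),
    8: (183, 1288, 789),
    10: (806, 1022, 688),
    12: (329, 281, 573),
    14: (111, 0, 537),
}

total_cells = sum(sum(day) for day in counts.values())
female_cells = sum(female for _, female, _ in counts.values())
male_cells = sum(male for _, _, male in counts.values())
retained_cells = female_cells + male_cells

# Lineage assignment of the retained cells (panel j).
lineages = {"EPI": 330, "PE": 179, "TE": 5363, "ysTE": 39}

print(total_cells, retained_cells, female_cells, male_cells)
print(sum(lineages.values()) == retained_cells)
```

The totals reproduce the 7,636 sequenced cells, the 5,911 cells retained after filtering, and the 2,798 female and 3,113 male cells given for panel h.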


LETTER 


[Figure panels: a-c, t-SNE plots (tSNE1 axis shown) coloured by expression of EPI markers (POU5F1, NANOG, SOX2), PE markers (GATA4, PDGFRA, FOXA2) and TE markers (GATA3, DAB2, TFAP2C); d, OTX2 expression in EPI, PE, TE and ysTE; e, heat maps of POU5F1, SOX2, NANOG, GATA4, PDGFRA, FOXA2, DAB2, GATA3, TFAP2C, CDX2 and SOX17, annotated by lineage (EPI, PE, TE, ysTE) and day (D6, D8, D10, D12, D14).]
the expression levels of conventional lineage markers. Each dot represents
a single cell, and the cells are coloured based on the expression levels
(log2(TPM + 1)) of several known gene markers for specific lineages.
For example, the EPI population expressed the key pluripotency markers
POU5F1 (encoding OCT4), NANOG and SOX2. a-c, Colours from
yellow to red represent expression levels from low to high. d, The violin
plots show the expression of OTX2 in three main lineages; by setting [...]
n = 39. e, The expression levels of conventional lineage markers at the
single-cell level. The cells that clustered away from the other three main cell
clusters showed high expression of TE markers (for example, GATA3) and
low expression of EPI (for example, SOX2) and PE (for example, GATA4)
markers. Colours from blue to red represent expression levels from low to
high.


[Figure panel labels: human lineage (EPI, PE, TE); embryonic day (E06, E07, E08, E09, E13, E14, E16, E17; Nakamura et al., 2016); monkey lineage (Pre-EPI, PreE-TE, PreL-TE, EXMC, Gast1, Gast2a, Gast2b, Hypoblast, ICM, Post-paTE, PostE-EPI, PostL-EPI; Nakamura et al., 2016); gene label RAB31.]


Extended Data Fig. 3 | Human lineage-specific gene-expression 
patterns on previously reported monkey data. The projection of 
signature genes for each lineage onto the monkey-related populations
around implantation stages, including DPPA5, IFITM1 and MEG3 in EPI 
[Extended Data Fig. 4 panel labels: principal component axes (PC2, PC3) and trajectory axes (Component 1, Component 2) for each lineage (EPI shown); embryo identifiers (hm_D10_E9, hv_D10_E7, hv_D10_E8, ha_D12_E1, ha_D12_E3, ha_D14_E1, hv_D12-E1, hv_D12-E2, hv_D12-E3, hv_D14_E1); t-SNE axis (tSNE2); number of expressed genes (5,000 to 10,000); clusters (C1); day (D6, D8, D10, D12).]


Extended Data Fig. 4 | Lineage-specific gene-expression dynamics
during implantation. a, Principal component analysis for each lineage.
Only cells that came from embryos that contained all three major lineages
were used. In total, 3,145 single cells were included: EPI, n = 282; PE,
n = 138; TE, n = 2,725. b, The developmental trajectory of each lineage
based on Monocle2. Only cells that came from embryos that contained
all three major lineages were used. In total, 3,145 single cells were used:
EPI, n = 282; PE, n = 138; TE, n = 2,725. c-e, t-SNE analysis of EPI
cells at all four stages revealed three clusters. c, Clusters are shown for
each of the days. d, One minor cluster (in grey) of these three clusters
might have separated mainly owing to
differences in the number of expressed genes, although all individual
cells from these clusters passed our quality control in the first procedure
of data processing. We could not exclude that this was caused by low
transcriptional activity of this cluster of cells or technical limitations
in our system. e, We therefore removed this cluster of cells and focused
on the differentially expressed genes between the two main clusters
(C1 and C2). c-e, In total, 282 single cells were included. f, The
differentially expressed genes between the two clusters were
largely consistent with those between EPI developmental stages, indicating that the
differences in gene-expression characteristics between the two clusters
are mainly due to differences in developmental stage (days 6 and 8
compared with days 10 and 12). The gene list related
to f is provided in Supplementary Table 5. g, Principal component analysis
for TE at different developmental stages. The sublineages of TE emerge
around day 10. Day 6 TE, n = 667 cells; day 10 TE, n = 1,443 cells; days
12 and 14 TE, n = 792 cells (day 12) and 533 cells (day 14).
Extended Data Fig. 5 | Differential gene-expression features of the two
TE clusters. a, TE cells from day-12 and day-14 embryos were divided
into two clusters. In total, 1,325 single cells were included. Day 12, n = 792
cells; day 14, n = 533 cells. b, Differential gene-expression features of the
two TE clusters.
Extended Data Fig. 6 | Representative CNV patterns during
implantation. a, Heat map shows large-scale CNVs in individual cells
(rows) from a day-6 embryo based on single-cell RNA-sequencing data.
The majority of cells from this embryo contained a whole-chromosomal
duplication of chromosome 17, and a portion of the cells had a
whole-chromosomal deletion of chromosome 7. b, CNVs confirmed by
single-cell whole-genome sequencing. Related to a. c, Representative
CNV-chimeric embryo. d, Heat map showing large-scale CNVs in individual
cells (rows) from a day-8 embryo based on single-cell RNA-sequencing
data. The majority of cells from this embryo contained a
whole-chromosomal deletion of chromosome 22, and a portion of the cells had
a whole-chromosomal deletion of chromosome 13. Cells also show different
chromosome X patterns at the transcriptome level. e, CNVs confirmed by
single-cell whole-genome sequencing. Related to d. f, g, Heat maps show
large-scale CNVs in individual cells (rows) from two day-12 embryos
based on single-cell RNA-sequencing data.

Extended Data Fig. 7 | Dynamics of chromosome X dosage during
implantation. Related to Fig. 3. a, The y axis represents the ratio of total
expression levels of genes located on chromosome 1 and the same number
of genes located on other autosomes (ChrA). Black lines indicate median
values, the boxes range from the 25th to 75th percentiles and the whiskers
correspond to 1.5× the IQR. In total, 3,184 single cells were included. Day
6, n = 387 (male); day 8, n = 1,525 (female, n = 1,147; male, n = 378); day
10, n = 1,021 (female, n = 917; male, n = 104); day 12, n = 251 (female,
n = 104; male, n = 147) (Supplementary Table 1). b, The expression ratios
of genes located on chromosome 1 or chromosome X to other autosomes
in the different embryos that we sequenced. In total, 3,184 single cells were
included: female, n = 2,168 single cells from 13 embryos; male, n = 1,016
single cells from 8 embryos. The statistical test was a two-sided t-test.
c, The ratio of chromosome X to other autosomes in adult human digestive
tract was used as a control. P, patient. P1, n = 763 cells; P2, n = 700 cells.
The statistical test was a two-sided t-test. d, XIST expression in cells of
different sexes. In total, 3,184 single cells were included: female, n = 2,168;
male, n = 1,016. e, The distribution of XIST expression levels across
different developmental stages for male and female embryos. In total,
3,184 single cells were included: day 6, n = 387 (male); day 8, n = 1,525
(female, n = 1,147; male, n = 378); day 10, n = 1,021 (female, n = 917;
male, n = 104); day 12, n = 251 (female, n = 104; male, n = 147). f, XIST
expression in different lineages for male and female embryos. In total,
3,184 single cells were included: female, n = 2,168; male, n = 1,016
(see details in Supplementary Table 1). g, The expression levels of XIST
and XACT for different lineages. EPI, n = 282 cells; PE, n = 138 cells; TE,
n = 2,725 cells. h, The distribution of XIST and XACT expression levels for
different lineages. Both, cells that expressed XIST and XACT; XIST, cells
that expressed only XIST; XACT, cells that expressed only XACT.
EPI, n = 282 cells; PE, n = 138 cells; TE, n = 2,725 cells.
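
Panels a and b describe expression ratios between the genes on one chromosome and an equal number of genes on other autosomes. The sketch below shows one way such a per-cell ratio could be computed. It is an illustration under stated assumptions, not the authors' implementation: the function name, the input layout and the gene names are hypothetical, and the caption does not say how the matched autosomal genes were chosen, so a seeded random draw is assumed here.

```python
import random


def chromosome_to_autosome_ratio(expression, chromosome_of, target, seed=0):
    """Ratio of summed expression on `target` to an equal-sized autosomal gene set.

    expression:    dict mapping gene name -> expression level in one cell
    chromosome_of: dict mapping gene name -> chromosome label ("1".."22", "X", "Y")
    target:        chromosome of interest, for example "X" or "1"
    """
    target_genes = [g for g in expression if chromosome_of[g] == target]
    # Candidate pool: autosomal genes that are not on the target chromosome.
    pool = sorted(
        g for g in expression
        if chromosome_of[g] not in (target, "X", "Y")
    )
    if not target_genes or len(pool) < len(target_genes):
        raise ValueError("not enough genes to build a matched autosomal set")
    # Assumption: the matched set is drawn at random; the seed keeps it repeatable.
    matched = random.Random(seed).sample(pool, len(target_genes))
    autosomal_total = sum(expression[g] for g in matched)
    if autosomal_total == 0:
        raise ValueError("matched autosomal genes have zero total expression")
    return sum(expression[g] for g in target_genes) / autosomal_total


# Toy cell with made-up gene names: two X-linked genes, four autosomal genes.
toy_expression = {"x1": 4.0, "x2": 6.0, "a1": 5.0, "a2": 5.0, "a3": 5.0, "a4": 5.0}
toy_chromosomes = {"x1": "X", "x2": "X", "a1": "1", "a2": "2", "a3": "3", "a4": "4"}
toy_ratio = chromosome_to_autosome_ratio(toy_expression, toy_chromosomes, "X")
print(toy_ratio)
```

In the toy cell every autosomal gene has the same expression level, so the result does not depend on which genes are drawn. With real data the ratio would vary with the draw, and repeating the sampling would be one way to gauge that variation.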
Extended Data Fig. 8 | DNA methylation patterns during implantation.
a, Total number of embryos and cells collected at each stage (days 6-12)
for single-cell Trio-seq analysis. b, t-SNE plot of cells based on the
expression matrix. Each dot represents one cell and colours represent
lineage types. For each lineage, we selected several individual cells for
bisulfite sequencing. In total, 2,544 cells were included: EPI, n = 79 cells;
PE, n = 136 cells; TE, n = 2,329 cells. c, Principal component analysis of
cells based on the expression matrix. Only cells that were also used for
bisulfite sequencing were used for the analysis. In total, 130 cells were
used: EPI, n = 31 cells; PE, n = 45 cells; TE, n = 54 cells. d-f, The number
of CpG sites detected with at least one-, three- and fivefold coverage across
the single-cell samples. In total, 2.7 Tb of sequencing data was generated,
and on average 10 million CpG sites per cell were covered. Black line,
median value. EPI, n = 31 cells; PE, n = 45 cells; TE, n = 54 cells.
g, h, t-SNE analysis based on promoter methylation levels of genes. Each
dot represents one cell and cells were coloured by culture day and lineage
identity. EPI, n = 31 cells; PE, n = 45 cells; TE, n = 54 cells. i, Single-cell
DNA methylation levels across gene bodies (from transcription start site
(TSS) to transcription end site (TES)) and the 20-kb flanking regions
of the transcription start and end sites. We found shared distribution
patterns: genes were hypomethylated around the transcription start sites,
evenly hypermethylated in the gene body regions and showed a significant
decrease in methylation after the transcription end sites.
Extended Data Fig. 9 | The CNV landscapes of human early embryos
by the DNA methylome dataset. a, The CNVs for each individual cell at
different developmental stages. Red represents duplication, blue represents
deletion and white represents diploid. b, Representative examples of
euploid and aneuploid single-cell samples. Abnormal copy numbers are
highlighted in red (duplication) or blue (deletion). c, The number of cells
with CNVs for each chromosome. d, The global DNA methylation levels
for euploid and aneuploid cells. Black lines indicate median values, the
boxes range from the 25th to 75th percentiles and the whiskers correspond
to 1.5× the IQR. EPI, n = 61 cells (euploid, n = 31; aneuploid, n = 30);
PE, n = 84 cells (euploid, n = 45; aneuploid, n = 39); TE, n = 141 cells
(euploid, n = 54; aneuploid, n = 87) (see details in Supplementary
Table 1).
Extended Data Fig. 10 | The re-methylation levels of different genomic
elements vary between the three lineages at the different developmental
stages. a, Global methylation levels on different genomic elements for
different cell lineages. EPI, n = 31 cells; PE, n = 45 cells; TE, n = 54
cells. b, The average DNA methylation levels of promoter regions
(250 bp upstream to 250 bp downstream of the transcription start site)
for each lineage at different developmental stages. Only promoters for
which methylation levels were less than 0.1 at day 6 and more than 0.35
at days 10, 12 and 14 are shown in the heat map. Colours from blue to
red represent methylation levels from low to high. EPI, n = 31 cells; PE,
n = 45 cells; TE, n = 54 cells. c, DNA methylation levels of representative loci
for lineage-specific genes at promoter regions. Each column represents
one read. Red represents a methylated CpG site, blue represents an
unmethylated CpG site and white represents an undetected site. Only
reads that covered at least five CpG sites are shown in the heat map.
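
The selection rule in panel b (below 0.1 at day 6, above 0.35 at the later days) and the promoter window (250 bp either side of the transcription start site) can be written as a short filter. This is a minimal sketch under assumptions: the function names, the input layout, the coordinate convention and the gene names are hypothetical, and treating a missing late day as "not tested" is a choice made here, not something the caption states.

```python
def promoter_window(tss, flank=250):
    """Return (start, end) of the promoter window around a transcription start site."""
    return (max(0, tss - flank), tss + flank)


def remethylated_promoters(levels, low=0.1, high=0.35, late_days=(10, 12, 14)):
    """Select promoters that start unmethylated and gain methylation later.

    levels: dict mapping gene -> {day: mean promoter methylation (0 to 1)}
    A promoter passes if its level is below `low` at day 6 and above `high`
    at every day in `late_days` for which a value is available.
    """
    selected = []
    for gene, by_day in levels.items():
        late = [by_day[d] for d in late_days if d in by_day]
        # A missing day-6 value defaults to 1.0, so the promoter is rejected.
        if by_day.get(6, 1.0) < low and late and all(v > high for v in late):
            selected.append(gene)
    return selected


# Toy data with made-up gene names and methylation levels.
toy_levels = {
    "geneA": {6: 0.05, 10: 0.40, 12: 0.50},  # passes both thresholds
    "geneB": {6: 0.20, 10: 0.60, 12: 0.70},  # already methylated at day 6
    "geneC": {6: 0.02, 10: 0.30, 12: 0.45},  # day 10 is below the late threshold
}
selected = remethylated_promoters(toy_levels)
print(selected)
print(promoter_window(1000))
```

Only `geneA` passes in the toy data, because it is the only promoter that is below 0.1 at day 6 and above 0.35 at every later day supplied.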
n/a | Confirmed

- The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement
- A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
- The statistical test(s) used AND whether they are one- or two-sided. Only common tests should be described solely by name; describe more complex techniques in the Methods section.
- A description of all covariates tested
- A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons
- A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)
- For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted. Give P values as exact values whenever suitable.
- For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
- For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes
- Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated

Our web collection on statistics for biologists contains articles on many of the points above.


Software and code 


Policy information about availability of computer code 


Data collection: No special or proprietary software was used.


Data analysis: We used available pipelines and software to analyze single-cell RNA-Seq data and single-cell PBAT data.
R packages and versions:
R-3.5.1 scater_1.8.4 Seurat_2.3.4 ggplot2_3.1.0 FactoMineR_1.41 Rtsne_0.15 monocle_2.8.0 libpheatmap_1.0.10 reshape2_1.4.3
DDRTree_0.1.5 tsne_0.1-3 SingleCellExperiment_1.2.0 S4Vectors_0.20.1 IRanges_2.16.0 SummarizedExperiment_1.10.1 scales_1.0.0
beeswarm_0.2.3 ggbeeswarm_0.6.0 tsne_0.1-3 igraph_1.2.2 HSMMSingleCell_0.114.0 RColorBrewer_1.1-2 fastICA_1.2-1 gplots_3.0.1
SCENIC_1.0.0-03 ggtern_3.1.0
fpc_2.1-11.1 pheatmap_1.0.10
Other packages:
tophat-2.0.14.Linux_x86_64 samtools-1.2 (RNA) & samtools-0.1.18 (WGS & PBAT) python2.7 HTSeq-0.6.1 cufflinks-2.2.1 picard-tools-1.130
GenomeAnalysisTK.jar (3.4-46-gbc02625) snpEff_v4.3 TrimGalore-0.5.0 bwa-0.7.12 jdk1.8.0_151 bismark_v0.7.6
bowtie-1.0.0 methpipe-3.4.3 cgmaptools-0.1.1 pyscenic-0.8.9
See details in Supplementary Information, Methods.


For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. 
We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information. 
Policy information about availability of data
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable:


- Accession codes, unique identifiers, or web links for publicly available datasets
- A list of figures that have associated raw data
- A description of any restrictions on data availability


Scripts for the main steps of the analysis are available on GitHub (https://github.com/WRui/Post_Implantation). Other R scripts associated with graphic
presentation are available from the authors on reasonable request.
Life sciences study design


All studies must disclose on these points even when the disclosure is negative.


Sample size: The sample size was determined when the main cell lineages at each developmental stage were captured. Related statistical analysis
provides the rationale for sufficiency of the sample sizes.


Data exclusions: For sequencing data, we excluded low-quality cells according to the criteria in Supplementary Information, Methods.
Replication: All attempts at replication were successful.


Randomization: Not applicable, since all single cells were allocated into different groups according to their developmental
stages.


Blinding: Not applicable, since there was no specific grouping.


Reporting for specific materials, systems and methods 


We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, 
system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response. 


Materials & experimental systems (n/a | Involved in the study)
- Antibodies
- Eukaryotic cell lines
- Palaeontology
- Animals and other organisms
- Human research participants
- Clinical data

Methods (n/a | Involved in the study)
- ChIP-seq
- Flow cytometry
- MRI-based neuroimaging
Antibodies


Antibodies used: Oct3/4 antibody (A1515, Santa Cruz Biotechnology, sc-5279, 1/200);
OTX2 antibody (KNO0314101, AF199, 1/200);
F-actin/AlexaFluor 488 phalloidin (JLM0O211061, Thermo Fisher Scientific, A12379);
GATA6 antibody (JLM0211061, R&D Systems, mab1700);
donkey anti-mouse AlexaFluor®568 (Thermo Fisher Scientific, A10037);
donkey anti-rabbit AlexaFluor®647 (Thermo Fisher Scientific, A31573).


Validation: All the antibodies used in this study were commercial antibodies and were used only for applications for which validation procedures
are described on the manufacturers' websites:
https://www.thermofisher.com; https://www.rndsystems.com; https://www.scbt.com/scbt/home


Human research participants 


Policy information about studies involving human research participants 


Population characteristics The average age of oocyte donors is 32 and the sperm donor is 41 years old. The oocyte donors involved in this study are fertile 
with at least one healthy baby. The oocyte donors with normal BMI are in good health. The healthy sperm donor with 
demonstrated fertility has normal semen parameters. 


Recruitment Research donors were recruited from Peking University Third Hospital. Before giving consent, donors have a suitable opportunity 
to receive proper counselling about the implications of the donation and potential risks. Gametes and embryos were collected 
with written informed consent from the donors in this study. 


Ethics oversight The Reproductive Medicine Ethics Committee of Peking University Third Hospital 


Note that full information on the approval of the study protocol must also be provided in the manuscript. 


Clinical data 


Policy information about clinical studies 
All manuscripts should comply with the ICMJE guidelines for publication of clinical research and a completed CONSORT checklist must be included with all submissions. 


Clinical trial registration Provide the trial registration number from ClinicalTrials.gov or an equivalent agency. 

Study protocol Note where the full trial protocol can be accessed OR if not available, explain why. 

Data collection Describe the settings and locales of data collection, noting the time periods of recruitment and data collection. 
Outcomes Describe how you pre-defined primary and secondary outcome measures and how you assessed these measures. 




ChIP-seq 


Data deposition 


Confirm that both raw and final processed data have been deposited in a public database such as GEO. 


Confirm that you have deposited or provided access to graph files (e.g. BED files) for the called peaks. 


Data access links (May remain private before publication.) For "Initial submission" or "Revised version" documents, provide reviewer access links. For your "Final submission" document, provide a link to the deposited data.
Files in database submission Provide a list of all files available in the database submission.
Genome browser session (e.g. UCSC) Provide a link to an anonymized genome browser session for "Initial submission" and "Revised version" documents only, to enable peer review. Write "no longer applicable" for "Final submission" documents.
Methodology 
Replicates Describe the experimental replicates, specifying number, type and replicate agreement. 
Sequencing depth Describe the sequencing depth for each experiment, providing the total number of reads, uniquely mapped reads, length of reads and whether they were paired- or single-end.


Antibodies Describe the antibodies used for the ChIP-seq experiments; as applicable, provide supplier name, catalog number, clone 
name, and lot number. 


Peak calling parameters Specify the command line program and parameters used for read mapping and peak calling, including the ChIP, control and 
index files used. 


Data quality Describe the methods used to ensure data quality in full detail, including how many peaks are at FDR 5% and above 5-fold 
enrichment. 


Software Describe the software used to collect and analyze the ChIP-seq data. For custom code that has been deposited into a
community repository, provide accession details. 


Flow Cytometry 


Plots 


Confirm that: 


The axis labels state the marker and fluorochrome used (e.g. CD4-FITC). 


The axis scales are clearly visible. Include numbers along axes only for bottom left plot of group (a 'group' is an analysis of identical markers). 


All plots are contour plots with outliers or pseudocolor plots. 


A numerical value for number of cells or percentage (with statistics) is provided. 


Methodology 


Sample preparation Describe the sample preparation, detailing the biological source of the cells and any tissue processing steps used. 


Instrument Identify the instrument used for data collection, specifying make and model number. 


Software Describe the software used to collect and analyze the flow cytometry data. For custom code that has been deposited into a 
community repository, provide accession details. 


Cell population abundance Describe the abundance of the relevant cell populations within post-sort fractions, providing details on the purity of the samples
and how it was determined. 


Gating strategy Describe the gating strategy used for all relevant experiments, specifying the preliminary FSC/SSC gates of the starting cell 
population, indicating where boundaries between "positive" and "negative" staining cell populations are defined. 


[ ] Tick this box to confirm that a figure exemplifying the gating strategy is provided in the Supplementary Information. 


Magnetic resonance imaging 


Experimental design 


Design type Indicate task or resting state; event-related or block design. 
Design specifications Specify the number of blocks, trials or experimental units per session and/or subject, and specify the length of each trial
or block (if trials are blocked) and interval between trials. 


Behavioral performance measures State number and/or type of variables recorded (e.g. correct button press, response time) and what statistics were used to establish that the subjects were performing the task as expected (e.g. mean, range, and/or standard deviation across subjects).
Acquisition 

Imaging type(s) Specify: functional, structural, diffusion, perfusion. 

Field strength Specify in Tesla 

Sequence & imaging parameters Specify the pulse sequence type (gradient echo, spin echo, etc.), imaging type (EPI, spiral, etc.), field of view, matrix size, 
slice thickness, orientation and TE/TR/flip angle. 

Area of acquisition State whether a whole brain scan was used OR define the area of acquisition, describing how the region was determined. 

Diffusion MRI Used Not used 


Preprocessing 


Preprocessing software Provide detail on software version and revision number and on specific parameters (model/functions, brain extraction, 
segmentation, smoothing kernel size, etc.). 


Normalization If data were normalized/standardized, describe the approach(es): specify linear or non-linear and define image types 
used for transformation OR indicate that data were not normalized and explain rationale for lack of normalization. 


Normalization template Describe the template used for normalization/transformation, specifying subject space or group standardized space (e.g. 
original Talairach, MNI305, ICBM152) OR indicate that the data were not normalized. 


Noise and artifact removal Describe your procedure(s) for artifact and structured noise removal, specifying motion parameters, tissue signals and 
physiological signals (heart rate, respiration). 


Volume censoring Define your software and/or method and criteria for volume censoring, and state the extent of such censoring. 


Statistical modeling & inference 


Model type and settings Specify type (mass univariate, multivariate, RSA, predictive, etc.) and describe essential details of the model at the first 
and second levels (e.g. fixed, random or mixed effects; drift or auto-correlation). 


Effect(s) tested Define precise effect in terms of the task or stimulus conditions instead of psychological concepts and indicate whether 
ANOVA or factorial designs were used. 

Specify type of analysis: Whole brain / ROI-based / Both

Statistic type for inference (See Eklund et al. 2016) Specify voxel-wise or cluster-wise and report all relevant parameters for cluster-wise methods.


Correction Describe the type of correction and how it is obtained for multiple comparisons (e.g. FWE, FDR, permutation or Monte 
Carlo). 


Models & analysis 


n/a | Involved in the study 


Functional and/or effective connectivity 


Graph analysis 


Multivariate modeling or predictive analysis 


Functional and/or effective connectivity Report the measures of dependence used and the model details (e.g. Pearson correlation, partial 
correlation, mutual information). 


Graph analysis Report the dependent variable and connectivity measure, specifying weighted graph or binarized graph, 
subject- or group-level, and the global and/or node summaries used (e.g. clustering coefficient, efficiency, 
etc.). 


Multivariate modeling and predictive analysis Specify independent variables, features extraction and dimension reduction, model, training and evaluation 
metrics. 
Microbiota-derived lantibiotic restores resistance
against vancomycin-resistant Enterococcus 


Sohn G. Kim, Simone Becattini, Thomas U. Moody, Pavel V. Shliaha, Eric R. Littmann, Ruth Seok, Mergim Gjonbalaj,
Vincent Eaton, Emily Fontana, Luigi Amoretti, Roberta Wright, Silvia Caballero, Zhong-Min X. Wang, Hea-Jin Jung,
Sejal M. Morjaria, Ingrid M. Leiner, Weige Qin, Ruben J. J. F. Ramos, Justin R. Cross, Seiko Narushima, Kenya Honda,
Jonathan U. Peled, Ronald C. Hendrickson, Ying Taur, Marcel R. M. van den Brink & Eric G. Pamer


Intestinal commensal bacteria can inhibit dense colonization of the 
gut by vancomycin-resistant Enterococcus faecium (VRE), a leading 
cause of hospital-acquired infections. A four-strain consortium
of commensal bacteria that contains Blautia producta BPSCSK
can reverse antibiotic-induced susceptibility to VRE infection.
Here we show that BPSCSK reduces growth of VRE by secreting a
lantibiotic that is similar to the nisin-A produced by Lactococcus
lactis. Although the growth of VRE is inhibited by BPSCSK and
L. lactis in vitro, only BPSCSK colonizes the colon and reduces VRE
density in vivo. In comparison to nisin-A, the BPSCSK lantibiotic has
reduced activity against intestinal commensal bacteria. In patients 
at high risk of VRE infection, high abundance of the lantibiotic 
gene is associated with reduced density of E. faecium. In germ-free 
mice transplanted with patient-derived faeces, resistance to VRE 
colonization correlates with abundance of the lantibiotic gene. 
Lantibiotic-producing commensal strains of the gastrointestinal 
tract reduce colonization by VRE and represent potential probiotic 
agents to re-establish resistance to VRE. 

Preventing transmission of highly antibiotic-resistant pathogens
in healthcare settings remains problematic. A promising approach
to reducing antibiotic-resistant infections involves enhancing
microbiota-mediated colonization resistance of the host by administering
protective commensal bacteria. Although mechanisms of colonization
resistance are being discovered, few bacterial strains that mediate
resistance have been identified. Faecal microbiota transplantation
(FMT), although effective for recurrent Clostridium difficile infection,
remains problematic because faecal compositions can be highly
variable. Preclinical studies suggest that commensal bacterial strains that
inhabit the lower gastrointestinal tract can be effective at providing
resistance.

Enterococci colonize the human gastrointestinal tract and have
developed resistance to antibiotics, including vancomycin.
Antibiotic-mediated depletion of the gut microbiota leads to expansion
of VRE in the intestine, predisposing patients to bloodstream
infections. In mice, FMT can re-establish colonization resistance
and reduce intestinal VRE density. We recently described a
four-strain consortium named CBBPSCSK, consisting of Clostridium
bolteae, Blautia producta (BPSCSK; SCSK refers to the Blautia strain,
which was characterized by S.C. and S.G.K.), Bacteroides sartorii and
Parabacteroides distasonis, that restored colonization resistance against
VRE in antibiotic-treated mice.

To determine the mechanism of CBBPSCSK-mediated VRE inhibition,
we co-cultured each strain with VRE (Fig. 1a, Extended Data Fig. 1a-d).
BPSCSK inhibited VRE growth, as did BPSCSK-conditioned media
(Extended Data Fig. 1e-i), and dilution experiments demonstrated that
BPSCSK-mediated inhibition is not due to nutrient depletion. In contrast
to BPSCSK-conditioned media, culture supernatants of B. producta
(Clostridiales VE202-06 (BPcontrol)) and other microbiota-derived
Blautia species did not inhibit VRE growth (Extended Data Fig. 1j,
Supplementary Tables 1, 2).

Previous studies demonstrated that BPSCSK requires the other
CBBPSCSK members to colonize the intestine. CBBPSCSK, but not a
modified consortium in which BPSCSK was replaced with BPcontrol
(CBBPcontrol), reduced VRE colonization (Fig. 1b), even though
both consortia colonized mice (Extended Data Fig. 2a, b). CBBPSCSK
also reduced VRE colonization in gnotobiotic mice (Extended Data
Fig. 2c, d). CBBPSCSK reduced colonization by several VRE strains
(Extended Data Fig. 2e-g, Supplementary Table 3) and fluorescence
in situ hybridization analysis demonstrated BPSCSK colonization
throughout the large intestine (Extended Data Fig. 3).

To determine whether BPSCSK produces an inhibitory factor, VRE
was cultured in caecal contents from mice reconstituted with CBBPSCSK
or CBBPcontrol (Extended Data Fig. 4a). Only CBBPSCSK caecal contents
inhibited VRE growth. Previous studies demonstrated that the
commensal microbiota stimulates secretion of a host-derived antimicrobial
peptide, such as RegIIIγ, which reduces intestinal VRE colonization.
CBBP colonization, however, did not induce Reg3g transcripts
or RegIIIγ protein in the ileum of antibiotic-treated mice (Extended
Data Fig. 4b, c). Host-derived antimicrobial peptides and inflammatory
genes did not differ between mice treated with CBBPSCSK or PBS
(Extended Data Fig. 4d-i). CBBPSCSK was effective at reducing VRE
density in Rag2−/−Il2rg−/− mice, indicating that T cells, B cells, natural
killer cells and innate lymphoid cells do not contribute to
CBBPSCSK-mediated VRE inhibition (Extended Data Fig. 4j).

VRE was inhibited by proteins precipitated from BPSCSK- but not
BPcontrol-conditioned media (Extended Data Fig. 5a), which suggests
that BPSCSK secretes an inhibitor. We performed whole-genome sequencing
of BPSCSK and BPcontrol and discovered that only BPSCSK contains an
operon for a lantibiotic, a lanthionine-containing antimicrobial peptide
(Extended Data Fig. 5b-d, Supplementary Tables 4, 5). Lanthionines
are formed by enzymatic dehydration of serine or threonine residues
that cyclize with neighbouring cysteine residues. Nisin-A, a
lantibiotic expressed by L. lactis, binds lipid II and inhibits the
synthesis of peptidoglycan and also forms a membrane pore complex.
Comparison of the lantibiotic operons from BPSCSK and L. lactis
(lan and nis, respectively) revealed homologous sequences for all genes
except dissimilar signal peptidase sequences (Extended Data Fig. 5c,
Supplementary Table 5). Although gene organization and number
Fig. 1 | BPSCSK expresses a lantibiotic in vivo that inhibits VRE. a, VRE
was co-cultured in vitro with each CBBPSCSK isolate (n = 15 biologically
independent samples from three independent experiments) and growth
was monitored. CFU, colony-forming unit; LOD, limit of detection.
b, Antibiotic-treated, VRE-dominated mice (n = 12 mice from three
independent experiments) received treatment by oral gavage containing
CBBPSCSK, CBBPcontrol, or PBS. VRE colonization was monitored by CFU
quantification in faecal samples. c, VRE was inoculated in culture broth
with commercial nisin-A (100 µM), purified LanA1-LanA4 lantibiotic


within the lan and nis operons differ, a lantibiotic operon recently
characterized in Blautia obeum is similar to that of BPSCSK. Notably, BPSCSK
encodes five lantibiotic precursor genes (lanA1-lanA5), in contrast to one
encoded by the nis operon (nisA). The first four precursor sequences
(lanA1-lanA4) are identical, whereas the fifth precursor (lanA5) encodes
a similar but non-identical sequence (Supplementary Table 4). lanA1-lanA4
and nisA belong to a lantibiotic subset that contains the gallidermin
superfamily domain, which conserves two N-terminal lanthionine
rings enabling lipid II binding and inhibitory activity.

Nisin-A and other lantibiotics of the gallidermin superfamily carry a
net positive charge, which enables electrostatic interactions with the cell
membrane and lipid II. The inhibitory factor of BPSCSK and nisin-A
elute similarly during cation exchange chromatography, which suggests
that they both carry a positive charge (Extended Data Fig. 5e).
In addition, the inhibitory factor of BPSCSK and nisin-A are resistant
to heat and proteases, a characteristic of lantibiotics (Extended Data
Fig. 5f). Methods to edit the genome of Blautia producta are lacking,
so we pursued a gain-of-function approach and heterologously
expressed lanA1-lanA4 in Escherichia coli (Extended Data Fig. 6a, b,
Supplementary Table 6), purified the lantibiotic to homogeneity, and
validated it by mass spectrometry (Extended Data Fig. 6c). VRE was
similarly inhibited by the addition of the purified BPSCSK LanA or
commercial nisin-A (Fig. 1c).

RNA sequencing of caecal contents from antibiotic-treated mice 
colonized with CBBPSCSK (Fig. 1d) demonstrated that, relative to the
overall transcriptome of BPSCSK, precursor lantibiotic transcripts and
associated immunity genes were abundant (greater than the ninety-fifth 
from BPSCSK (100 µM), or PBS (n = 4 biologically independent samples
from two independent experiments). VRE CFUs were enumerated 8 h
after inoculation. d, RNA sequencing analysis was performed on caecal
content from mice treated with CBBP (n = 3 mice from one independent
experiment). RPKM, reads per kilobase of transcript per million mapped
reads. VRE (ATCC 700221) was used in experiments in a-c. *P = 0.0286,
****P < 0.0001, two-tailed Mann-Whitney U-test for comparisons with
negative control (PBS, VRE culture alone). Data are mean ± s.d. (a, c),
median ± range (b) and median values (d).


percentile), whereas genes involved in post-translational modification
of the precursor lantibiotic were expressed to a lesser degree. Oral
administration of proteins precipitated from BPSCSK but not BPcontrol
cultures reduced VRE colonization in antibiotic-treated mice challenged
with VRE (Extended Data Fig. 7), albeit to a lesser degree than
CBBPSCSK administration. This probably reflects reduced concentrations
of lantibiotic owing to intestinal absorption, metabolism and
intermittent administration. These findings demonstrate that BPSCSK
encodes a lantibiotic that is highly expressed and inhibits VRE in vivo.
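
The abundance measure used here, RPKM, and the comparison against the ninety-fifth percentile of the transcriptome can be illustrated with a short calculation. The sketch below is only an illustration, not the authors' pipeline: the gene names, read counts and gene lengths are invented, and `rpkm` and `percentile_rank` are hypothetical helper names.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def percentile_rank(value, values):
    """Percentage of entries in `values` that are <= `value`."""
    return 100.0 * sum(v <= value for v in values) / len(values)


# Hypothetical data: gene -> (mapped reads, gene length in bp).
counts = {
    "lanA_like": (9000, 180),
    "immunity_like": (6000, 700),
    "modification_like": (400, 3000),
    "housekeeping_1": (1500, 1200),
    "housekeeping_2": (800, 900),
}
total_reads = sum(reads for reads, _ in counts.values())

values = {gene: rpkm(reads, length, total_reads)
          for gene, (reads, length) in counts.items()}
for gene, value in sorted(values.items(), key=lambda kv: -kv[1]):
    rank = percentile_rank(value, list(values.values()))
    print(f"{gene}: RPKM = {value:.0f}, percentile rank = {rank:.0f}")
```

Because RPKM divides by gene length, a short precursor gene can rank far above a long modification gene even when the raw read counts are of similar magnitude.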

L. lactis is a lantibiotic-producing probiotic that, theoretically, could
be used to reduce VRE colonization. VRE is inhibited after co-culture
with BPSCSK or L. lactis (Fig. 2a) and after exposure to precipitated
proteins from either species (Extended Data Fig. 8a). By contrast, in vivo
VRE colonization was inhibited by CBBPSCSK but not when BPSCSK was
replaced by L. lactis (CLBP) (Fig. 2b). Although BPSCSK is prevalent in
the microbiota after CBBPSCSK treatment (relative abundance > 25%),
L. lactis was not detected after CLBP treatment (Fig. 2c). The failure
of L. lactis to colonize the intestine probably explains its inability to
reduce VRE density in vivo; L. lactis also does not colonize the porcine
intestine or inhibit Listeria monocytogenes or C. difficile in a human
distal-colon model.

To characterize the antibacterial spectrum of the BPSCSK lantibiotic,
we cultured 152 commensal strains obtained from human faeces
(Supplementary Table 7) with protein precipitated from BPSCSK
or BPcontrol cultures or broth spiked with nisin-A diluted to the same
minimal inhibitory concentration (MIC) against VRE as BPSCSK. The
MIC was determined as the highest dilution that inhibited growth
Fig. 2 | BPSCSK colonizes the gastrointestinal tract and broadly inhibits
Gram-positive pathogens while preserving some commensal species.
a, VRE was co-cultured in vitro with L. lactis or BPSCSK (n = 9 biologically
independent samples from three independent experiments) and growth
was monitored. b, Antibiotic-treated, VRE-dominated mice (n = 12
mice from three independent experiments) received an oral gavage
that contained CBBPSCSK, CLBP, or CBBPcontrol. VRE colonization was
monitored by CFU quantification in faecal samples. c, The microbiota
composition determined by metagenomic sequencing of 16S rRNA
genes from faecal samples collected from mice treated with CBBPSCSK or
CLBP. d, Culture broth was conditioned with proteins precipitated from
BPSCSK, BPcontrol or commercial nisin-A and serially diluted. The MIC


over 24 h. Protein precipitates from BPSCSK-conditioned or
nisin-A-spiked media, but not BPcontrol-conditioned media, inhibited
Gram-positive, but not Gram-negative, bacterial strains. A resistance
index that compares the MIC values in conditioned media from
BPcontrol to that from BPSCSK or nisin-A was used to quantify the
sensitivity of bacterial strains to both lantibiotics. The Gram-positive
population demonstrated greater sensitivity to nisin-A-spiked media
than BPSCSK-conditioned media (Fig. 2d). Several VRE strains and
other Gram-positive nosocomial pathogens (Extended Data Fig. 8b)
demonstrate comparable sensitivity to either conditioned media, but
many Gram-positive commensal strains were more resistant to the
BPSCSK lantibiotic than nisin-A, including members implicated in
resistance to intestinal infections, such as Bifidobacterium longum
and Pediococcus acidilactici (Extended Data Fig. 8c). Thus, the
BPSCSK lantibiotic, relative to nisin-A, has a narrower spectrum of
activity that targets VRE while preserving commensal bacteria.
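
To make the resistance index concrete, the following sketch computes an MIC as the highest dilution factor that still inhibits growth and then takes the ratio of the control MIC to the test MIC. It assumes two things that the text does not specify: growth is scored against a hypothetical readout cutoff (`inhibited_below`), and a preparation that never inhibits is assigned a dilution factor of 1. All readings are invented.

```python
def mic_dilution_factor(growth_by_dilution, inhibited_below=0.1):
    """Highest dilution factor whose growth readout stays below the cutoff.

    Assumption: if no dilution inhibits growth, return 1 so that the
    ratio below is still defined.
    """
    inhibited = [factor for factor, growth in growth_by_dilution.items()
                 if growth < inhibited_below]
    return max(inhibited) if inhibited else 1


def resistance_index(mic_control, mic_test):
    """Ratio of the control-media MIC to the test-media MIC."""
    return mic_control / mic_test


# Hypothetical readouts for one strain: dilution factor -> growth after 24 h.
control = {1: 0.85, 2: 0.90, 4: 0.88, 8: 0.91}
bp_scsk = {1: 0.02, 2: 0.03, 4: 0.05, 8: 0.70}
nisin_a = {1: 0.02, 2: 0.02, 4: 0.03, 8: 0.04}

for name, readout in (("BPSCSK", bp_scsk), ("nisin-A", nisin_a)):
    index = resistance_index(mic_dilution_factor(control),
                             mic_dilution_factor(readout))
    print(name, index)
```

Under these assumptions a smaller index means that the strain is still inhibited at a higher dilution, that is, it is more sensitive to that preparation.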
Among 32 Blautia isolates cultured from faecal samples from 
healthy donors, BPSCSK was the only strain that encoded a lantibiotic


was determined for 158 strains from a commensal biobank by calculating
the highest dilution factor that inhibited growth (n = 2 biologically
independent samples from two independent experiments). The resistance
index is the ratio of the MIC of BPcontrol-conditioned media to the MIC
of BPSCSK- or nisin-A-conditioned media. VRE (ATCC 700221) was used
for experiments in a-c. ***P < 0.001, ****P < 0.0001, two-tailed
Mann-Whitney U-test for comparisons with negative control (a, b) or between
experimental conditions (d). Data are median values (a) or mean ± s.d.
(b). For the box plots in d, centre line denotes the median, box limits
represent the upper and lower quartiles, and errors denote
1.5× the interquartile range.


and inhibited VRE in vitro (Fig. 3a, Supplementary Tables 1, 2). To 
determine the prevalence of lantibiotic genes in the human intestinal 
microbiome, we shotgun-sequenced faecal samples collected from 15 
healthy donors and identified lantibiotic genes and homologues that 
contain the gallidermin superfamily domain (Extended Data Fig. 9a) in 
7 of 15 samples, with different sequences within and between samples 
(Fig. 3b, Extended Data Fig. 9b). 

We next mined the genomes of commensal biobank isolates for the gal- 
lidermin superfamily domain and identified one additional Clostridiales 
species, Ruminococcus faecis, which encodes a similar lantibiotic and 
inhibits VRE in vitro, whereas R. faecis strains that did not encode a lanti- 
biotic did not inhibit VRE (Fig. 3c, Extended Data Fig. 9c, Supplementary 
Table 1). Although only a minority of cultured commensal bacteria 
encodes lantibiotics, it remains unclear whether this reflects their paucity 
in the microbiota or their relative resistance to in vitro culture. 

Patients undergoing allogeneic haematopoietic cell transplantation 
frequently have intestinal domination by VRE. From a biobank of
longitudinally collected faecal samples, we identified 238 samples from 
Fig. 3 | Lantibiotic genes are present in human microbiomes of healthy
individuals and gut resident, lantibiotic-producing species inhibit
VRE. a, Microbiota-derived Blautia species were whole-genome
sequenced, assembled, annotated and mined for lantibiotic precursor
sequences. VRE was inoculated in conditioned media from 39 strains
(n = 4 biologically independent samples from four independent
experiments) and monitored for growth. b, Lantibiotic detection from
shotgun sequencing of human faecal samples (n = 15 faecal samples).
c, In total 421 commensal biobank isolates were whole-genome sequenced,
assembled, annotated and mined for lantibiotic precursor sequences
to identify a strain of R. faecis that encoded a homologous lantibiotic.
VRE was inoculated in conditioned media from three strains of R. faecis
cultures (n = 4 biologically independent samples from four independent
experiments) with or without detected lantibiotic genes, and VRE growth
was monitored 8 h after inoculation. VRE (ATCC 700221) was used in
experiments in a and c. *P = 0.0286, two-tailed Mann-Whitney U-test for
comparisons with negative control. Data are mean ± s.d. (a, c).


22 patients with a range of E. faecium densities and found lantibiotic
gene abundance inversely correlated with the relative abundance of
E. faecium (Spearman correlation coefficient = −0.43, P = 2.08 × 10⁻¹⁰)
(Fig. 4a). Samples with high lantibiotic abundance (Lanhigh > 95th


percentile) consistently had low abundance of E. faecium (<10% 16S
relative abundance), and were detected in half of patients (Extended
Data Fig. 10). In Lanhigh and Lanlow settings, 25% and 21%, respectively,
had high microbiota diversity (inverse Simpson index > 8)
Fig. 4 | Enrichment of lantibiotic genes correlates with reduced
E. faecium in patient faecal samples. a, b, Longitudinally collected
faecal samples (n = 238 biologically independent samples) from
22 patients undergoing allogeneic haematopoietic cell transplantation
were shotgun sequenced. a, The relative abundance of E. faecium
determined by 16S rRNA was plotted against lantibiotic gene abundance
(Spearman correlation coefficient = −0.43, P = 2.08 × 10⁻¹⁰). b, Samples
were then stratified by abundance of the lantibiotic, and the relative
abundance of E. faecium was plotted against microbiota α diversity. The
percentage of sample distribution is shown in each quadrant. c, Faecal
microbiota transplants were performed on germ-free mice using
diversity-matched microbiomes containing either high or low lantibiotic gene
abundance. One week after FMT administration, mice were orally gavaged 
with VRE and colonization was monitored by quantifying VRE from 
faecal samples. VRE (ATCC 700221) was used for experiments in c. Low 
lantibiotic abundance < 2?° < high lantibiotic abundance (RPKM); low 
E. faecium abundance < 10 < high E. faecium abundance (% relative 16S); 
low α diversity < 8 < high α diversity (inverse Simpson index). *P < 0.05,
***P < 0.0001, two-tailed Mann-Whitney U-test.


and low E. faecium abundance, which suggests that diversity
compensates for low lantibiotic-gene abundance by parallel,
lantibiotic-independent inhibitory mechanisms (Fig. 4b). However, nearly half
of the Lanlow samples with low diversity (inverse Simpson index < 8)
had high E. faecium abundance (>10% 16S relative abundance);
low diversity decreases the likelihood, but some commensal species
still provide lantibiotic-independent colonization resistance against
E. faecium. By contrast, Lanhigh samples had low E. faecium abundance
(P<1x 107°) despite low diversity, consistent with the notion that 
lantibiotic gene abundance in the microbiome contributes to coloni- 
zation resistance against E. faecium. 
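
The stratification used in this analysis can be sketched as follows. The inverse Simpson index is the reciprocal of the sum of squared taxon proportions; the cutoffs of 8 for diversity and 10% for E. faecium relative abundance come from the text, whereas the taxon counts, the function names and the handling of values exactly at a cutoff are assumptions made for illustration.

```python
def inverse_simpson(counts):
    """Inverse Simpson index: 1 / sum(p_i ** 2) over taxon proportions."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)


def stratify(efaecium_pct, diversity, efaecium_cut=10.0, diversity_cut=8.0):
    """Label a sample by E. faecium abundance and by diversity.

    Assumption: values exactly at a cutoff are labelled 'low'.
    """
    abundance = "high" if efaecium_pct > efaecium_cut else "low"
    richness = "high" if diversity > diversity_cut else "low"
    return abundance, richness


# Hypothetical 16S counts per taxon; the first entry stands for E. faecium.
samples = {
    "even_community": [5] + [10] * 12,
    "dominated": [90, 4, 3, 2, 1],
}
for name, counts in samples.items():
    efaecium_pct = 100.0 * counts[0] / sum(counts)
    print(name, stratify(efaecium_pct, inverse_simpson(counts)))
```

A community of equally abundant taxa has an inverse Simpson index equal to the number of taxa, whereas a sample dominated by a single organism approaches 1, which is why domination by E. faecium and low diversity tend to coincide.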

To determine whether low-diversity Lanhigh microbiomes can resist
VRE colonization, we identified diversity-matched Lanhigh and Lanlow
samples and colonized germ-free mice before VRE challenge (Fig. 4c,
Supplementary Table 8). Regardless of diversity, Lanhigh samples
consistently reduced VRE colonization compared with Lanlow samples,
which suggests that lantibiotics in the gastrointestinal tract provide
colonization resistance.

Microbiota-mediated colonization resistance remains incompletely 
defined and restoring resistance during antibiotic-induced dysbiosis
remains an important goal. BPSCSK belongs to a small subset of
commensals that secrete lantibiotics, and therefore can influence the 
community structure of the microbiota. A potential clinical role for 
lantibiotics is supported by a previous report that uses lantibiotic- 
producing commensal Staphylococcus species on the skin to provide 
colonization resistance against Staphylococcus aureus. Understanding
the mechanisms by which the microbiota confers colonization resist- 
ance may lead to the development of therapies to repair dysbiosis, 
thereby reducing susceptible patients’ risk of colonization by antibi- 
otic-resistant pathogens. 
METHODS


Bacterial strains. Vancomycin-resistant E. faecium purchased from ATCC 
(stock number 700221) was used for all experiments unless otherwise stated. 
Vancomycin-resistant E. faecalis strains used were V583 and MH.SK1. Listeria 
monocytogenes strains used were 10403S and 13932. Salmonella Typhimurium 
strains used were SL1344 and LT2. The following strains were isolated from 
patients at Memorial Sloan Kettering Cancer Center: vancomycin-resistant 
E. faecium strains MH.0139G, MH.0151E, MH.1107 and MH.1326H; vancomycin- 
resistant E. faecalis strain MH.SK1; C. difficile strain MH.BBL2; methicillin-resistant 
S. aureus strains MH.SK1 and MH.SK2; Klebsiella pneumoniae strains MH189 and 
MH258; E. coli strains MH.T18 and MH.X43; and Proteus mirabilis strains MH.42F 
and MH.43A. All gut commensal strains used were isolated from faecal samples of 
healthy donors and are listed in Supplementary Table 7. 

Mouse husbandry. All experiments using wild-type mice were performed with 
C57BL/6J female mice that were 6-8 weeks old; mice were purchased from Jackson
Laboratories. Rag2−/−Il2rg−/− mice were purchased from Taconic Farms, and
subsequently bred in-house. Germ-free mice were bred in-house in germ-free isolators.
All mice were housed in sterile, autoclaved cages with irradiated food and acidified, 
autoclaved water. Mouse handling and weekly cage changes were performed by 
investigators wearing sterile gowns, masks and gloves in a sterile biosafety hood. 
All animals were maintained in a specific-pathogen-free facility at Memorial 
Sloan Kettering Cancer Center Animal Resource Center. After co-housing 
for at least two weeks, mice were individually housed and randomly assigned 
to experimental groups. All animal experiments were performed at least three 
times unless otherwise noted. Experiments were performed in compliance with 
Memorial Sloan-Kettering Cancer Center institutional guidelines and approved 
by the institution’s Institutional Animal Care and Use Committee. 

Mouse antibiotic administration. Mice were administered ampicillin (0.5 g l⁻¹,
Fisher Scientific) in the drinking water for 5 days. Ampicillin was changed every 
3 days. Antibiotic administration ceased after the initial administration of com- 
mensal bacteria (after 5 days) unless stated otherwise. 

Bacterial in vitro broth culture conditions. The culture broth used for all cul- 
tures was pre-reduced brain heart infusion broth supplemented with yeast extract 
(5 g l⁻¹) and L-cysteine (1 g l⁻¹). The culture conditions were 37 °C and anaerobic
unless otherwise stated. 

VRE CFU enumeration. VRE CFUs were enumerated from samples by serial 
dilution in PBS and plating on BD Enterococcosel agar supplemented with vancomycin
(8 µg ml⁻¹; Novagen) and streptomycin (100 µg ml⁻¹; Fisher Scientific).
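
The enumeration step reduces to simple arithmetic on colony counts. The sketch below is illustrative only: the function names, the plated volume and the dilution used in the example are assumptions and are not specified in the text; only the 100 mg ml⁻¹ faecal suspension concentration comes from the Methods.

```python
def cfu_per_ml(colonies: int, fold_dilution: float, plated_volume_ml: float) -> float:
    """Back-calculate CFUs per ml of the undiluted suspension.

    fold_dilution is the total dilution of the plated sample
    (for example 1e4 for a 10^-4 dilution). Hypothetical helper.
    """
    if plated_volume_ml <= 0 or fold_dilution <= 0:
        raise ValueError("volume and dilution must be positive")
    return colonies * fold_dilution / plated_volume_ml


def cfu_per_gram(cfu_ml: float, suspension_mg_per_ml: float = 100.0) -> float:
    """Convert CFUs per ml of suspension to CFUs per gram of material.

    The default of 100 mg/ml matches the normalized pellet suspension
    described in the Methods.
    """
    grams_per_ml = suspension_mg_per_ml / 1000.0
    return cfu_ml / grams_per_ml


# Assumed example: 42 colonies on a plate that received 0.1 ml of a 10^-4 dilution.
per_ml = cfu_per_ml(42, 1e4, 0.1)
per_g = cfu_per_gram(per_ml)
print(f"{per_ml:.2e} CFUs per ml, {per_g:.2e} CFUs per g")
```

Counts from plates with too few or too many colonies are usually excluded before this calculation; that choice is left to the reader.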
VRE in vitro co-culture inhibition experiments. A frozen aliquot of each bac- 
terial strain was inoculated and cultured in broth for 24 h. The resulting cultures 
were plated as lawns on pre-reduced Columbia agar supplemented with 5% sheep 
blood and cultured anaerobically at 37°C for 24 h, collected and resuspended in 
pre-reduced PBS (10⁸ CFUs ml⁻¹). Using these stocks, VRE (10° CFUs ml⁻¹) was
co-cultured with each candidate bacterium (10° CFUs ml⁻¹) for 6, 24 and 48 h.
VRE CFUs were enumerated at each time point. The co-cultured candidate bacterium
CFUs were enumerated at each time point by anaerobically plating serial
dilutions of the culture on pre-reduced Columbia agar supplemented with 5%
sheep blood and subtracting the enumerated VRE CFUs from the resulting total count.
VRE in vitro supernatant inhibition experiments. A frozen aliquot of each bacterial
strain was inoculated and cultured for 24 h. Culture supernatant was collected
by centrifugation at 8,000g for 5 min and subsequent filtration (0.22 µm).
Supernatants were diluted 1:2 with culture broth. VRE was subsequently inoculated
(10° CFUs ml⁻¹) and cultured for 6, 24 and 48 h. VRE CFUs were enumerated at
each time point.

Fluorescence in situ hybridization. Intestinal tissues with luminal contents 
were carefully excised and fixed in freshly made nonaqueous Methacarn solu- 
tion (60% methanol, 30% chloroform and 10% glacial acetic acid) as previously 
described?!” for 6 h at 4°C. Tissues were washed in 70% ethanol, processed 
with Leica ASP6025 processor (Leica Microsystems) and paraffin-embedded 
by standard techniques. Subsequently, 5-µm sections were baked at 56°C for 1
h before staining. Tissue sections were deparaffinized with xylene (twice, 10 min 
each) and rehydrated through an ethanol gradient (95%, 10 min; 90%, 10 min) 
to water. Sections were incubated with a probe specific to BPSCSK ([Alexa546]-
TATAAGACTCAATCCGAAGAGATCAT-[Alexa546]) at 50°C for 3 h. Probes
were diluted to 5 ng µl⁻¹ in 0.9 M NaCl, 20 mM Tris-HCl at pH 7.2 and 0.1% SDS
before use. Sections were later washed twice in 0.9 M NaCl, 20 mM Tris-HCl at
pH 7.2 (wash buffer) for 10 min and counterstained with Hoechst (1:3,000 in wash 
buffer) for nuclear staining. 

VRE in vivo decolonization experiments. Antibiotic-treated mice or germ-free
mice were orally gavaged with VRE (10* CFUs in 200 µl PBS). Three days after
VRE inoculation, mice were orally gavaged with isolates from a candidate bacteria
consortium (10° CFUs per isolate in 200 µl PBS) or vehicle (PBS). VRE colonization
was monitored by enumerating VRE CFUs from faecal pellets at the stated
time points. Faecal pellets were resuspended in PBS to a normalized concentration
(100 mg ml⁻¹) for VRE CFU enumeration. Mice were screened for pre-existing VRE
colonization by selective plating before proceeding with all experiments.
VRE ex vivo inhibition experiments. Antibiotic-treated mice were orally gavaged
with isolates of a given bacteria consortium (10° CFUs per isolate), vehicle (PBS),
or VRE (10° CFUs) in 200 µl PBS. Seven days after inoculation, the content from
the caecum was obtained and resuspended in pre-reduced PBS to a normalized
concentration (100 mg ml⁻¹). Supernatants from caecal content suspensions were
collected by centrifugation at 8,000g for 5 min and subsequent filtration (0.22 µm).
VRE (10³ CFUs ml⁻¹) was inoculated and cultured anaerobically at 37°C for 6 h,
and VRE CFUs were enumerated.

Ammonium sulfate precipitation experiments. A frozen aliquot of each bacte- 
rium was inoculated and cultured to late log phase at 37°C unless stated otherwise. 
Lactococcus lactis was cultured to late log phase at 25°C. The resulting culture 
supernatants were collected by centrifugation at 8,000g for 5 min and subsequent
filtration (0.22 µm). To collect 0-45% fractions, ammonium sulfate was added to
45% saturation and equilibrated overnight with stirring at 4°C. The precipitate was
collected by centrifugation at 20,000g for 20 min, dissolved in PBS, and dialysed
(molecular mass cut-off of 3 kDa) against PBS overnight at 4°C. To collect
45-90% fractions, ammonium sulfate was added to the 0-45% fraction supernatants
to 90% saturation. The precipitate was collected as described for 0-45% fractions.
Total protein concentrations were normalized (2 mg ml⁻¹) and diluted in
culture broth (20 µg ml⁻¹). VRE was inoculated (10? CFUs ml⁻¹) and cultured for
6 and 24 h. VRE CFUs were enumerated at each time point.

Lantibiotic gene expression in vivo experiments. Antibiotic-treated mice were 
orally gavaged with CBBP (10° CFUs per isolate in 200 µl). Two weeks after inoculation,
the content from the caecum was obtained and flash-frozen. The samples
were subsequently RNA extracted, sequenced, and analysed as described below. 
Construction of pRSFDuet-1/LanA+LanB and pCDFDuet-1/LanC. Construction of
expression vectors was based on previous methodology. Modified fragments of
pRSFDuet-1 and pCDFDuet-1 were generated by custom gene synthesis (IDT), in which
the precursor sequence LanA and the dehydratase LanB were inserted into multiple
cloning site (MCS) 1 and MCS 2, respectively, in pRSFDuet-1; the cyclase LanC
was inserted into MCS 2 in pCDFDuet-1. The respective vector backbones, excluding
the regions analogous to the modified gene fragments containing the lantibiotic gene
inserts described earlier, were linearized by inverse PCR amplification using
linear_pDuet-1F (5’-CGAGTCTGGTAAAGAAACCGCTG-3’) and linear_pRSFDuet-1R
(5’-GATCCTGGCTGTGGTGATGATGGT-3’) for pRSFDuet-1, and linear_pDuet-1F
and linear_pCDFDuet-1R (5’-TTCTTATACTTAACTAATATACTAA-3’)
for pCDFDuet-1. The modified gene fragments containing the inserts
were PCR amplified. For pRSFDuet-1 inserts, the first fragment,
pRSFDuet-1.MCS1-gblock, was amplified using gblock_pRSFDuet-1.MCS1F
(5’-ACCATCATCACCACAGCCAGGAT-3’) and gblock_pRSFDuet-1.MCS1R
(5’-AAAAACTTTTGTAAATCGAATACTGATTTCTTCTGC-3’). The second
fragment, pRSFDuet-1.MCS2-gblock, was amplified using
gblock_pRSFDuet-1.MCS2F (5’-AGAAATCAGTATTCGAT-3’) and
gblock_pDuet-1.MCS2R (5’-AGCAGCGGTTTCTTTACCAGACTCG-3’). For the pCDFDuet-1
insert, the gene fragment was amplified using gblock_pCDFDuet-1.MCS2F
(5’-TTAGTATATTAGTTAAGTAT-3’) and gblock_pDuet-1.MCS2R. The gene
fragments were cloned into the linearized vector backbones using In-Fusion HD
Cloning Plus (Takara). Stellar competent cells (Takara) were transformed with
the fused vectors by heat shock and plated on selective plates at 37°C for 16 h.
The pRSFDuet-1/LanA+LanB transformants were selected on Luria broth (LB)
agar supplemented with kanamycin (30 µg ml⁻¹), and pCDFDuet-1/LanC transformants
were selected on LB agar supplemented with streptomycin (50 µg ml⁻¹).
Colonies containing each vector were inoculated in LB supplemented
with the respective antibiotics for selection and cultured for 10 h at 37°C,
followed by isolation of the plasmids using a Qiaprep Spin Miniprep Kit (Qiagen).
The sequences of the resulting plasmids were confirmed by DNA sequencing. The
sequences of the lantibiotic genes are listed in Supplementary Table 4, and the
custom gene fragment sequences are listed in Supplementary Table 6.
Overexpression and purification of lantibiotic. These were performed as previously
described. In brief, chemically competent BL21(DE3) cells were co-transformed
with pRSFDuet-1/LanA+LanB and pCDFDuet-1/LanC. Overnight
cultures grown from a single colony transformant were used as an inoculum for
larger scale cultures in terrific broth medium containing 30 mg l⁻¹ kanamycin and
50 mg l⁻¹ streptomycin at 37°C until the OD600 nm reached between 0.6 and 0.8.
The cultures were then induced with 1 mM IPTG and incubated at 18°C for an
additional 16 h. The cells were collected by centrifugation at 8,000g for 15 min.
The cell pellets corresponding to 1.5 l of culture were resuspended in 45 ml of
start buffer (20 mM Tris, 500 mM NaCl, 10% glycerol, protease inhibitor cocktail
from Roche, pH 8.0). The suspensions were chilled on ice and lysed using a
Branson ultrasonic homogenizer (35% amplitude, 10-s pulse, 10-s pause for a total
of 10 min). The lysate supernatant was collected by centrifugation at 30,000g for
30 min at 4°C. Chromatographic purification was performed using an ÄKTA pure
chromatography system at 4°C. The lysate supernatant was loaded onto a HiTrap
HP nickel affinity column. The column was washed with 75 ml of start buffer plus
30 mM imidazole, and the recombinant product was eluted in 25 ml of start buffer plus
1 M imidazole. The His-tagged lanthipeptide eluate was loaded on a Luna 10 µm
C8(2) 100 Å, LC Column 250 × 4.6 mm and separated with an 80-min linear gradient
of 0-80%. Buffer A was 0.1% TFA in H2O and buffer B was 90% acetonitrile, 20%
buffer A. The LanA1-LanA4 peptide itself and its hydration (+18 Da) series eluted
in fractions 40-50 (Extended Data Fig. 6b) with the maximum for fully dehydrated
product at 45%. Fractions 43-46 were lyophilized and the concentration of the
solution was measured by BCA assay. We obtained approximately 1 mg of product
from bacteria in 1.3 l of medium. The His tag and leader sequence were removed
by trypsin digestion for 2 h at 25°C. The digestion was stopped by adding formic
acid to 1% and the product was separated by reverse phase chromatography on
a 0-80% linear gradient as described above. The resulting product was checked
by electrospray ionization-mass spectrometry and the spectrum was deisotoped
and deconvoluted by the Xtract algorithm in Xcalibur. The proteolytic fragment
corresponding to mature LanA1-LanA4 was observed: 3,152.45. VRE was inoculated
in culture broth supplemented with the purified lantibiotic (100 µM) and cultured
for 24 h. VRE CFUs were subsequently enumerated.

DNA extraction. DNA was extracted using a phenol-chloroform extraction technique
with mechanical disruption (bead beating). In brief, a frozen aliquot of approximately
100 mg per sample was suspended, while frozen, in a solution containing 500 µl
of extraction buffer (200 mM Tris, pH 8.0; 200 mM NaCl; and 20 mM EDTA), 210 µl
of 20% SDS, 500 µl of phenol:chloroform:isoamyl alcohol (25:24:1), and 500 µl of
0.1-mm-diameter zirconia/silica beads (BioSpec Products). Microbial cells were lysed
by mechanical disruption with a bead beater (BioSpec Products) for 2 min, followed
by two rounds of phenol:chloroform:isoamyl alcohol extraction. After extraction,
DNA was precipitated in ethanol, resuspended in 200 µl of TE buffer with RNase
(100 mg ml⁻¹), and further purified with QIAamp mini spin columns (Qiagen).
Microbial composition by 16S rRNA gene sequencing. Universal bacterial
primers 563F (5’-nnnnnnnn-NNNNNNNNNNNN-AYTGGGYDTAAAGNG-3’)
and 926R (5’-nnnnnnnn-NNNNNNNNNNNN-CCGTCAATTYHTTTRAGT-3’),
in which ‘N’ represents unique 12-base-pair Golay barcodes and ‘n’
represents additional nucleotides to offset the sequencing of the primers, were
used to PCR-amplify the V4-V5 hypervariable region of the 16S rRNA gene. The
V4-V5 amplicons were purified, quantified, and pooled at equimolar concentrations
before ligating Illumina barcodes and adaptors using the Illumina TruSeq
Sample Preparation protocol. The completed library was sequenced using the
MiSeq Illumina platform. Paired-end reads were merged and demultiplexed.
The UPARSE pipeline was used for error filtering using the maximum expected
error (Emax = 1), clustering sequences into operational taxonomic units (OTUs)
of 97% distance-based similarity, and identifying and removing potential chimeric
sequences using both de novo and reference-based methods. Singleton sequences
were removed before clustering. A custom Python script incorporating nucleotide
BLAST, with NCBI RefSeq as reference training set, was used to perform
taxonomic assignment to the species level (E < 1 x 107!) using representative 
sequences from each OTU. 
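
The maximum expected error filter (Emax = 1) can be illustrated with a short sketch. The expected number of errors in a read is the sum of the per-base error probabilities implied by its Phred scores, and a read is discarded when that sum exceeds Emax. The function names and the Phred+33 encoding are assumptions made for illustration; this is not the UPARSE implementation itself.

```python
from typing import Iterable, List


def phred_from_ascii(quality: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string (Phred+33 assumed) into integer scores."""
    return [ord(ch) - offset for ch in quality]


def expected_errors(phred_scores: Iterable[int]) -> float:
    """Sum of per-base error probabilities, p = 10^(-Q/10)."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def passes_emax(phred_scores: Iterable[int], emax: float = 1.0) -> bool:
    """Keep a read only if its expected error count does not exceed emax."""
    return expected_errors(phred_scores) <= emax


# A 100-base read at Q30 carries about 0.1 expected errors and is kept;
# a 20-base read at Q10 carries about 2 expected errors and is discarded.
print(passes_emax([30] * 100), passes_emax([10] * 20))
```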

Whole-genome sequencing, assembly and annotation. An overnight culture 
grown from a single colony in culture broth was DNA extracted and sequenced 
using the Illumina MiSeq platform. Purified DNA was sheared using a Covaris 
ultrasonicator and prepared for sequencing with a Kapa library preparation kit 
with Illumina TruSeq adaptors to create 300 × 300 bp nonoverlapping paired-end
reads. Raw sequence reads were filtered (Phred score > 30, 4 bp sliding window)
using Trimmomatic (v.0.36). Trimmed reads were assembled into contigs and
annotated with putative open reading frames using the assembly and annotation
services in PATRIC (v.3.5.25).
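
The sliding-window quality filter can be sketched as follows. This is a simplified illustration of the idea (scan from the 5’ end and cut the read at the first 4-base window whose mean quality drops below the threshold), not a reimplementation of Trimmomatic; the function name and the example read are hypothetical.

```python
from typing import List, Tuple


def sliding_window_trim(
    seq: str, quals: List[int], window: int = 4, min_mean_q: float = 30.0
) -> Tuple[str, List[int]]:
    """Truncate a read at the start of the first window with mean quality < min_mean_q.

    Reads shorter than the window are returned unchanged in this sketch.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    for start in range(len(quals) - window + 1):
        window_quals = quals[start:start + window]
        if sum(window_quals) / window < min_mean_q:
            return seq[:start], quals[:start]
    return seq, quals


# Hypothetical read whose quality collapses after six bases.
read = "ACGTACGTAC"
quality = [40, 40, 40, 40, 40, 40, 10, 10, 10, 10]
trimmed_seq, trimmed_qual = sliding_window_trim(read, quality)
print(trimmed_seq)
```

In practice a minimum-length filter is usually applied after trimming so that very short remnants are dropped; no such threshold is stated in the text.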

Metagenomic sequencing. DNA was extracted, sheared, and libraries were pre- 
pared as described for whole-genome sequencing. Sequencing was performed 
using the Illumina HiSeq platform (Illumina) with a paired-end 100 x 100 bp kit 
in pools targeting 20-30 million reads per sample. 

RNA extraction. Samples were extracted using an acidic phenol-chloroform
protocol. In brief, approximately 100 mg per sample was suspended in 700 µl of
RNA. The suspension was homogenized using a sterile RNase-free spatula and
incubated at 4°C overnight. Samples were pelleted by centrifugation at 13,000g for
10 min and resuspended in 200 µl of RNA extraction buffer supplemented with
proteinase K (1 mg ml⁻¹) that was heat-activated at 50°C for 10 min. Samples
were incubated at room temperature for 10 min and vortexed every 2 min. Then,
300 µl of Qiagen RLT Plus Buffer (Qiagen) with β-mercaptoethanol (1%) was
added to each sample, vortexed and incubated for 5 min at room temperature.
Samples were then transferred to a sterile bead beating tube with 500 µl of 0.1 mm
glass beads and 500 µl of acidic phenol:chloroform:isoamyl alcohol. Mechanical lysis was
performed by bead beating the samples for 3 min (BioSpec Products), followed
by one round of acidic phenol-chloroform extraction and one round of chloroform
extraction. RNA was precipitated with 50 µl of 3 M ammonium acetate and
500 µl of 100% isopropanol and incubated at −20°C overnight. RNA was pelleted
by centrifugation at 13,000g for 20 min at 4°C and washed with 450 µl of 70%
ethanol. The ethanol wash was repeated, and the pellet was allowed to air dry at room
temperature for 5 min. The pellet was then dissolved in 50 µl of RNase-free water.
RNA samples were purified using RNAClean XP (Agencourt), DNA contaminants
were removed using the TURBO DNA-Free kit (Life Technologies), and ribosomal
RNA was removed using the Ribo-Zero rRNA Removal Kit (Illumina). Following ribosomal
RNA depletion, RNAClean XP purification was repeated.

RNA sequencing and analysis. RNA sample libraries were prepared using the 
TruSeq Stranded mRNA protocol (Illumina) and sequenced using the Illumina 
MiSeq platform (Illumina). Raw sequence reads were filtered using Trimmomatic
(v.0.36), aligned to the genome of BPSCSK using bowtie2 (v.2.3.4.1), assigned to
genes using featureCounts (v.1.6.1), and converted to normalized gene counts using
DESeq2 (v.1.20.0).

Oral administration of BPSCSK protein precipitate. Antibiotic-treated mice were
orally gavaged with BPSCSK or BPcontrol protein precipitate (400 µg). Three hours
later, VRE (10 CFUs in 200 µl PBS) was orally gavaged, followed by oral administrations
of BPSCSK or BPcontrol protein precipitate every 3 h for 12 h. VRE colonization
was monitored by enumerating VRE CFUs from faecal pellets 12 h post-VRE-gavage.
Faecal pellets were resuspended in PBS to a normalized concentration
(100 mg ml⁻¹) for VRE CFU enumeration. Mice were screened for pre-existing VRE
colonization by selective plating before proceeding with all experiments.
Healthy-donor faecal isolate collection. Faecal samples were collected from
healthy human donors (n = 15) and transferred to an anaerobic chamber
within 1 h of collection. All culturing was performed anaerobically
on pre-reduced Columbia agar supplemented with 5% sheep blood at 37°C.
Samples were resuspended in pre-reduced PBS and serially diluted with three
tenfold serial dilutions. The dilutions were streaked on plates and cultured for
72 h. Individual colonies were selected and streaked onto fresh plates and cultured
for 48 h. Single colonies were then resuspended in 50 µl of pre-reduced
PBS and 10 µl was streaked as a lawn onto a fresh plate and cultured for 48 h.
Each isolate was obtained from culture and stocks were stored in pre-reduced
PBS with 10% glycerol at −80°C. Colony PCR was performed using 2 µl of the
above 50 µl single-colony suspension in PBS as a template. The 16S rRNA gene
was amplified with primers 8F (5’-AGAGTTTGATCCTGGCTCAG-3’) and
1492R (5’-GGTTACCTTGTTACGACTT-3’). Amplicons were purified with the
Qiaquick PCR Purification Kit (Qiagen) and Sanger sequenced (Eton Biosciences)
with a panel of 6 primers: 8F (5’-AGAGTTTGATCCTGGCTCAG-3’), 533F
(5’-GTGCCAGCAGCCGCGGTAA-3’), 16S.1100.F16 (5’-CAACGAGCGCAACCCT-3’),
1492R (5’-GGTTACCTTGTTACGACTT-3’), 907R (5’-CCGTCAATTCMTTTRAGTTT-3’)
and 519R (5’-GWATTACCGCGGCKGCTG-3’). Sanger
sequences were quality filtered and assembled into a consensus sequence using custom
Python scripts. Species identification was performed with nucleotide BLAST
against the NCBI RefSeq database.

Patient stool collection. Patients were enrolled in a prospective faecal collection 
protocol, in which faecal samples were routinely collected during the initial trans- 
plant hospitalization and stored in a biospecimen bank, as described previously’. 
Patients were part of a study consisting of adult patients (>18 years) undergoing 
allogeneic haematopoietic stem-cell transplantation at Memorial Sloan Kettering 
Cancer Center (MSKCC). The study was approved by the Institutional Review 
Board at MSKCC. All study patients provided written informed consent for IRB- 
approved biospecimen collection and analysis (protocols 09-141, 06-107). The 
study was conducted in accordance with the Declaration of Helsinki. 
Lantibiotic gene mining. The lantibiotic genes were discovered in the genome
of BPSCSK using antiSMASH and BAGEL3 and confirmed to be homologous
to known lantibiotic gene sequences using BLASTp alignment (Supplementary
Table 5). Lantibiotic sequences were identified from metagenomic sequences using
DIAMOND (v.0.9.22) to align reads (E < 0.001) to a custom database derived
from the RefSeq nonredundant database (accessed August 2018), filtering only for
lantibiotic genes containing the gallidermin superfamily domain. To identify RefSeq
entries containing the gallidermin superfamily domain, a hidden Markov model
profile was built according to the NCBI Conserved Protein Domain Family entry for
the gallidermin superfamily domain (accession cl03420) by using pfam02052 and
TIGR03731 hidden Markov model files and searching for RefSeq entries with these
sequence patterns using HMMER (3.1b2) (E < 1 x 107). Lantibiotic sequences
were identified from whole-genome-sequenced genomes by assembling and annotating
genomes as described previously. All open reading frames were searched
for homology to the gallidermin superfamily domain using HMMER (3.1b2).
Detected lantibiotic sequence assembly from metagenomic sequencing.
Translated sequencing reads aligning to a RefSeq database entry were retrieved
from the DIAMOND (v.0.9.22) alignment output and grouped by the RefSeq entry
to which they aligned. All sequencing reads within a group were aligned to each
other by multiple sequence alignment using MUSCLE (v.3.8.31), and the consensus
sequence was used as the assembled, detected lantibiotic sequence.
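
The consensus step can be sketched as a column-wise majority vote over an already aligned group of reads. The text does not describe how the consensus was derived from the MUSCLE alignment, so the rule below (most frequent character per column, columns dominated by gaps dropped) is an assumption, and the function name and example sequences are hypothetical.

```python
from collections import Counter
from typing import List


def consensus_from_alignment(aligned: List[str], gap: str = "-") -> str:
    """Majority-vote consensus of equal-length aligned sequences.

    Columns in which the gap character is the most frequent symbol are
    omitted. Ties resolve to the symbol seen first in the column.
    """
    if not aligned:
        raise ValueError("no sequences given")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must have equal length")
    consensus = []
    for column in zip(*aligned):
        symbol, _count = Counter(column).most_common(1)[0]
        if symbol != gap:
            consensus.append(symbol)
    return "".join(consensus)


# Toy alignment of three translated reads (made-up sequences).
print(consensus_from_alignment(["MKT-A", "MKTGA", "MRT-A"]))
```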
Statistics. Statistical analyses were performed using R (v.3.3.1) and GraphPad
Prism (v.7.0a) software packages. The two-tailed Mann-Whitney U-test was used 
for comparisons of continuous variables between two groups with similar vari- 
ances. No statistical methods were used to predetermine sample size. When pos- 
sible, investigators were blinded during group allocation and outcome assessment 
(16S and metagenomic shotgun sequence collection, extraction, quantification 
and analysis; enumeration of VRE in animal, ex vivo and in vitro experiments). 
Data were visualized using bar plots with centre values representing the geomet- 
ric mean and error bars representing the geometric s.d.; line graphs with points 
representing the geometric mean and error bars representing the geometric s.d.; 
box plots with the centre line representing the median, box limits representing 
the upper and lower quartiles and whiskers representing 1.5x the interquartile 
range; and heat maps with individual values contained in a matrix representing the 
mean. Spearman rank correlation tests (two-tailed) were used to find significant 
correlations between two continuous variables. 
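
The geometric mean and geometric s.d. used for the bar plots and line graphs can be computed on the log scale. The sketch below assumes strictly positive values and uses the sample standard deviation (n − 1) of the log-transformed data; whether the plotting software used the sample or the population form is not stated in the text, and the function names are illustrative.

```python
import math
import statistics
from typing import Sequence


def geometric_mean(values: Sequence[float]) -> float:
    """exp of the arithmetic mean of the natural logs."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric statistics require positive values")
    return math.exp(statistics.fmean(math.log(v) for v in values))


def geometric_sd(values: Sequence[float]) -> float:
    """exp of the sample s.d. of the natural logs (a unitless fold factor)."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric statistics require positive values")
    return math.exp(statistics.stdev(math.log(v) for v in values))


# Made-up CFU counts spanning two orders of magnitude.
counts = [1e3, 1e4, 1e5]
gm, gsd = geometric_mean(counts), geometric_sd(counts)
print(f"geometric mean = {gm:.3g}; error bar from {gm / gsd:.3g} to {gm * gsd:.3g}")
```

Because the geometric s.d. is a multiplicative factor, error bars extend from mean ÷ s.d. to mean × s.d. rather than symmetrically around the mean. Samples with zero counts need separate handling, for example substitution of the limit of detection.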

Reporting summary. Further information on research design is available in 
the Nature Research Reporting Summary linked to this paper. 


Data availability 
Microbiome sequencing data are available from Bioproject with the accession 
number 394877. 


Extended Data Fig. 2 | BPSCSK, but not BPcontrol, reduces VRE
colonization in vivo. a, b, Faecal samples collected from antibiotic-treated,
VRE-dominated mice (n = 4 mice from one independent
experiment) orally gavaged with CBBPSCSK (a) or CBBPcontrol (b) were
shotgun sequenced and the relative abundance of each species was
determined by 16S rRNA. c, d, Antibiotic-treated (c) or germ-free (d) mice
(n = 8 mice from two independent experiments) were orally gavaged
with VRE. Three days later, VRE-dominated mice received an oral gavage
of CBBPSCSK or CBBPcontrol and VRE colonization was monitored by
quantifying VRE in faecal samples. e-g, Antibiotic-treated mice (n = 4
mice from one independent experiment) were orally gavaged with
different strains of clinical VRE isolates. Three days later, VRE-dominated
mice received an oral gavage of CBBPSCSK or CBBPcontrol and VRE
colonization was monitored by quantifying VRE in faecal samples. The
following VRE strains were used: strain 0151F is an E. faecium MLST type
ST80 (e); strain 1107 is an E. faecium MLST type ST412 (f); strain V583 is
an E. faecalis strain (g). VRE strains used were VRE (ATCC 700221) (a-d),
VRE (0151F) (e), VRE (1107) (f), and VRE (V583) (g). Data are mean
± s.d. *P < 0.05, ***P < 0.001, two-tailed Mann-Whitney U-test.
Extended Data Fig. 3 | BPSCSK colonizes the large intestine. Antibiotic-treated
mice (n = 5 mice from one independent experiment) were orally
administered CBBPSCSK. Two weeks later, BPSCSK localization around
the mucosal epithelium (top) and lumen (bottom) of the caecum were
visualized by fluorescence in situ hybridization. Entire caecum cross-sections
were hybridized with a probe specific for BPSCSK. Sections were
counterstained with Hoechst dye to visualize the nuclei. Representative
images are shown. Scale bars, 25 µm.


Extended Data Fig. 4 | CBBPSCSK mediates VRE colonization resistance
by producing an inhibitor. a, Antibiotic-treated mice (n = 8 mice
from two independent experiments) received treatment by oral gavage
containing CBBPSCSK, CBBPcontrol, PBS or VRE. One week later, VRE
was inoculated into the caecal content and growth was monitored 6 h
after inoculation. b-i, Antibiotic-treated mice received an oral gavage
containing CBBPSCSK (n = 4 mice from one independent experiment)
or PBS (n = 3 mice from one independent experiment). Wild-type mice
(n = 4 mice from one independent experiment) were untreated and
received no antibiotics. Four days later, RNA and proteins were extracted
from the distal ileum, and RegIIIγ was measured by RT-qPCR (b) and
western blot (c). Other genes involved in host-derived antimicrobial
peptide production, including angiogenin-4 (ang4) (d), defensin-1 (def1)
(e), amphiregulin (areg) (f), and deleted in malignant brain tumours 1
(dmbt1) (g); or inflammatory mediators including cytochrome b beta
(cybb) (h) and calgranulin A (s100a8) (i) were measured by RT-qPCR.
j, Rag2−/−Il2rg−/− mice were treated with antibiotics, and orally gavaged
with VRE. Three days later, VRE-dominated mice received CBBPSCSK
or CBBPcontrol by oral gavage and VRE colonization was monitored by
quantifying VRE in faecal samples. VRE (ATCC 700221) was used in
experiments in a and j. *P < 0.05 (0.0286), ***P < 0.001, ****P < 0.0001,
two-tailed Mann-Whitney U-test. Data are median ± range (a) or mean ±
s.d. (b, d-i).


Extended Data Fig. 5 | BPSCSK encodes a lantibiotic. a, VRE was
inoculated in media conditioned with BPSCSK or BPcontrol culture protein
precipitate fractions (n = 8 biologically independent samples from two
independent experiments), and monitored for growth. b, c, BPSCSK
was whole-genome sequenced, assembled and annotated. b, Schematic
comparing the lantibiotic operon discovered in the genome of BPSCSK
to the nisA operon from L. lactis. Gene functions are based on the
characterization of homologous genes in the nis operon. c, Amino
acid sequence alignment comparing the BPSCSK lantibiotic precursor
(LanA1-LanA4) and the nisin-A precursor (NisA). Sequence features are
based on the characterization of nisin. d, The molecular formula for the
mature, post-translationally modified BPSCSK LanA1-LanA4 lantibiotic
with a predicted mass of 3,152.45 Da. Abu, alpha-aminobutyric acid;
Dha, dehydroalanine; Dhb, dehydrobutyrine. e, Media conditioned with
BPSCSK or BPcontrol culture protein precipitates, or commercial nisin-A,
were incubated with proteinase K for 3 h at 37°C, boiled at 100°C, or left
untreated. The treated protein precipitate (n = 8 biologically independent
samples from four independent experiments) was serially diluted and
VRE was inoculated and cultured for 24 h. The MIC value was the highest
mean dilution in which VRE inhibition was observed. f, Proteins were
precipitated from BPSCSK, BPcontrol or nisin-A-spiked cultures and
applied to a SP sepharose column. Each fraction was serially diluted and
VRE was inoculated and cultured for 24 h to determine the MIC (n = 4
biologically independent samples from four independent experiments).
VRE (ATCC 700221) was used in experiments in a, e and f. ***P < 0.001,
****P < 0.0001, two-tailed Mann-Whitney U-test for comparisons with a
negative control. Data are mean ± s.d. (a, f) or mean values (e).
Extended Data Fig. 6 | Heterologous expression of BPSCSK LanA1-LanA4
lantibiotic. Genes involved in biosynthesis of the BPSCSK lantibiotic
(His-tagged-LanA, LanB and LanC) were cloned into expression vectors
(pRSFDuet-1/LanA+LanB, pCDFDuet-1/LanC) and heterologously
expressed in E. coli. a, Schematic map indicating where each lantibiotic
gene was inserted into the respective expression vectors. b, c, The His-tagged
LanA1-LanA4 lantibiotic was purified from E. coli lysates by
HiTrap HP nickel affinity chromatography and subsequently purified to
homogeneity by reversed-phase high-performance liquid chromatography.
The leader sequence and His tag were removed by trypsin digestion to
yield the mature lantibiotic. The purified His-tag product (b) and the
purified mature lantibiotic (c) were analysed by electrospray ionization-mass
spectrometry (ESI-MS) and the spectrum was deisotoped and
deconvoluted using the Xtract algorithm in Xcalibur. The signals with
labels correspond to the predicted mass of the His-tagged lantibiotic
(M) and its incomplete forms that did not dehydrate all nine residues
(for example, M + 1·H2O and M + 2·H2O).
Extended Data Fig. 7 | Oral administrations of BPSCSK protein
precipitate reduce VRE colonization in vivo. Antibiotic-treated mice
(n = 9 mice from three independent experiments) were administered
BPSCSK or BPcontrol protein precipitate. Three hours later, VRE was orally
gavaged, followed by oral administrations of BPSCSK or BPcontrol protein
precipitate every 3 h for 12 h and VRE colonization was monitored by
quantifying VRE in faecal samples. VRE (ATCC 700221) was used.
*P = 0.0232, two-tailed Mann-Whitney U-test. Data are mean ± s.d.
Extended Data Fig. 8 | The BPSCSK lantibiotic has a narrower spectrum
of activity against Gram-positive commensal strains. a, VRE was
inoculated in media conditioned with BPSCSK, L. lactis or BPcontrol
culture protein precipitate (n = 4 biologically independent samples from
four independent experiments) and growth was monitored 24 h after
inoculation. b, c, Culture broth was conditioned with proteins precipitated
from BPSCSK, BPcontrol or commercial nisin-A and serially diluted. The
MIC value was determined for common nosocomial pathogens (b) or
158 strains from a commensal biobank (n = 2 biologically independent
samples from two independent experiments) (c) by calculating the
highest dilution factor that inhibited growth. The resistance index is the
ratio of the MIC of BPcontrol-conditioned media to the MIC of BPSCSK-
or nisin-A-conditioned media (b). The lantibiotic sensitivity ratio was
calculated as the MIC of nisin-A to the MIC of the BPSCSK lantibiotic for
each strain (c). *P < 0.05, ****P < 0.0001, two-tailed Mann-Whitney
U-test for comparisons with a negative control (a) or between two
experimental conditions (b, c). Box plots are as defined in Fig. 2.
Extended Data Fig. 9 | Identification of lantibiotic sequences from 
metagenomic sequencing of healthy human faecal samples. a, The 
profile hidden Markov model used to identify the gallidermin superfamily 
domain, illustrated as a logo. b, Multiple sequence alignment of lantibiotic 
precursor sequences identified from shotgun sequencing of healthy-donor 
faecal samples. Detected lantibiotic sequences are the assembly 
of lantibiotic reads from shotgun metagenomic faecal samples. c, A total 
of 421 species were individually isolated from healthy human faecal 
samples, whole-genome sequenced, assembled, annotated and mined for 
lantibiotic precursor sequences to identify a strain of R. faecis encoding a 
homologous lantibiotic. The precursor lantibiotic sequence is compared to 
the sequences of the BPSCSK LanA,-LanA, lantibiotic and nisin-A by multiple 
alignment. 
Extended Data Fig. 10 | Lantibiotic sequences identified from 
metagenomic sequencing of hospitalized patient faecal samples. 
a, Stacked heat map matrices represent a single patient. The top row 
illustrates abundance of the lantibiotic gene (RPKM). The bottom row 
illustrates relative abundance of E. faecium (percentage of 16S). Columns 
represent the sample collection day relative to transplant. 
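
As a reminder of what the RPKM unit in this legend expresses, the following sketch applies the standard definition (reads per kilobase of gene per million mapped reads). The function name and the example counts are illustrative assumptions, not values from the study.

```python
def rpkm(gene_reads: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of gene per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return gene_reads * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical sample: 50 reads on a 1,000-bp gene, 10 million mapped reads.
print(rpkm(gene_reads=50, gene_length_bp=1_000, total_mapped_reads=10_000_000))
```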
provide a protective barrier for the joint 
Macrophages are considered to contribute to chronic inflammatory 
diseases such as rheumatoid arthritis¹. However, both the 
exact origin and the role of macrophages in inflammatory joint 
disease remain unclear. Here we use fate-mapping approaches 
in conjunction with three-dimensional light-sheet fluorescence 
microscopy and single-cell RNA sequencing to perform a 
comprehensive spatiotemporal analysis of the composition, 
origin and differentiation of subsets of macrophages within 
healthy and inflamed joints, and study the roles of these 
macrophages during arthritis. We find that dynamic membrane-like 
structures, consisting of a distinct population of CX3CR1+ 
tissue-resident macrophages, form an internal immunological 
barrier at the synovial lining and physically seclude the joint. 
These barrier-forming macrophages display features that are 
otherwise typical of epithelial cells, and maintain their numbers 
through a pool of locally proliferating CX3CR1− mononuclear 
cells that are embedded into the synovial tissue. Unlike recruited 
monocyte-derived macrophages, which actively contribute to joint 
inflammation, these epithelial-like CX3CR1+ lining macrophages 
restrict the inflammatory reaction by providing a tight-junction- 
mediated shield for intra-articular structures. Our data reveal 
an unexpected functional diversification among synovial 
macrophages and have important implications for the general role 
of macrophages in health and disease. 

The healthy synovial cavity is a fluid-containing sterile space that 
lacks immune cell trafficking. During inflammatory joint diseases such 
as rheumatoid arthritis, increasing numbers of mononuclear phago- 
cytes and synovial fibroblasts are thought to contribute to an expanding 
synovial pannus that drives the destruction of articular cartilage and 
bone?~*. Previous work that addressed the role of monocytes and mac- 
rophages during arthritis accordingly suggested that these cells promote 
both the onset and the progression of joint inflammation!*-®, a scenario 
that has substantially shaped our current view on the role of these cells 
during inflammatory disease in general. 

More recent studies have questioned the concept that macrophages 
uniformly originate from blood monocytes, and have shown that cer- 
tain subsets of macrophages populate organs during early development 
and subsequently self-sustain their numbers in a monocyte-independent 
manner?~”. Individual subsets of such resident macrophages have distinct 
transcriptional and epigenetic signatures, which suggests that they 
have highly specialized and tissue-specific functions'*"!°. These recent 
insights prompted us to question prevailing paradigms and to revisit 
the origin and function of synovial macrophages during homeostasis 
and inflammatory joint disease. 

CX3CR1 is a chemokine receptor that is specifically used by mononuclear 
phagocytes and their precursors!'. To visualize the spatial distribution 
of CX3CR1+ macrophages and macrophages originating from 
CX3CR1+ precursors, respectively, we performed confocal immunofluorescence 
microscopy and three-dimensional light-sheet fluorescence 
microscopy of optically cleared knee joints in Cx3cr1creRosa26(R26)-tdTomato 
mice (Fig. 1a, b, Supplementary Video 1). This approach 
revealed membrane-like structures of synovial tdTomato+ macrophages 
that formed a dense physical barrier between the synovial 
capillary network and the intra-articular space, thereby secluding the 
joint space from the exterior (Fig. 1b, c, Supplementary Videos 2-4). 
Analysis of ColVIcreR26-tdTomato reporter mice showed that these 
macrophages formed the uppermost cellular layer and covered the 
lining of collagen VI-expressing synovial fibroblasts (Extended Data 
Fig. 1a). In Cx3cr1GFP mice, we confirmed that such membrane-forming 
lining macrophages selectively expressed CX3CR1, stained positive 
for CD68 and F4/80, and constituted 40% of the total synovial 
macrophages under steady-state conditions. By contrast, interstitial synovial 
macrophages did not express CX3CR1 (Extended Data Fig. 1a, b). 

Next, we studied the response of macrophages during K/BxN 
serum-transfer arthritis (STA) and collagen-induced arthritis as mouse 
models of rheumatoid arthritis. The onset of inflammation resulted in 
a rapid change in the morphology and spatial orientation of CX3CR1+ 
macrophages that suddenly abrogated cell-cell contacts (Fig. 1d, 
Extended Data Fig. 1c-h, Supplementary Videos 5-7). Simultaneously, 
collagen VI-expressing fibroblasts started occupying the synovial 
surface (Extended Data Fig. 1f) and Ly6G+ polymorphonuclear leucocytes 
(PMNs) appeared within the intra-articular space. Dying PMNs were 
subsequently removed by lining macrophages that had acquired a 
palisade-like shape (Fig. 1d, Extended Data Fig. 1c-e). 

During embryonic development, we detected CX3CR1+ synovial 
lining macrophages by embryonic day (E) 15.5 and E16.5, which indicates 
that synovial macrophage precursors derive from early embryonic 
haematopoiesis (Extended Data Fig. 2a). Adult parabiotic wild-type 
mice that shared circulation with Cx3cr1GFP mice displayed chimeric 
myeloid cell populations within the peripheral blood, but no detectable 
chimerism among CX3CR1+ synovial lining macrophages (Fig. 2a). 
These data suggested that, in the adult mouse, this subset of 
macrophages maintained its numbers independent of blood monocytes. 
Analysis of Ki67 expression revealed no signs of proliferation within 
Fig. 1 | CX3CR1+ lining macrophages form a dynamic membrane-like 
structure around the synovial cavity. a-d, Representative 3D light-sheet 
fluorescence microscopy (LSFM) and confocal laser scanning 
microscopy (CLSM) of knee joints of Cx3cr1creR26-tdTomato mice (LSFM, 
n = 10; CLSM, n = 3). a, The spatial localization of synovial macrophages 
(tdTomato, red) and PMNs (Ly6G, green) is shown during steady state 
(autofluorescence (AF), grey). Arrowheads indicate the localization 
of the macrophage layer (tdTomato, red) at the border of the synovial 
cavity (sc). bm, bone marrow; m, meniscus. Scale bars, 500 μm (left), 
100 μm (right). b, Top, LSFM analysis of the spatial arrangement of the 
synovial macrophage lining (tdTomato, red; arrowheads) and CD31+ 


the layer of CX3CR1+ lining macrophages (Fig. 2b), thus raising 
questions regarding their mechanism of repopulation and turnover. Notably, 
we detected clusters of proliferating Ki67+CX3CR1− interstitial cells 
within deeper layers of the synovial tissue (Fig. 2b). The Ki67 signal 
decreased in upper cellular layers in which CX3CR1 expression 
simultaneously increased, suggestive of a pool of proliferating CX3CR1− 
interstitial macrophages that contributed to the pool of CX3CR1+ lining 
macrophages. Antibody-mediated staining of the M-CSF receptor 
(CSF1R) within the synovial tissue, as well as mice expressing a Csf1r 
promoter-driven GFP (Csf1rGFP mice), showed that only CX3CR1− 
interstitial macrophages, and not CX3CR1+ lining macrophages, 
expressed CSF1R on their surface (Extended Data Fig. 2b, c). 
We therefore crossed R26-tdTomato mice with mice expressing a 
Csf1r promoter-driven and tamoxifen-dependent Cre recombinase 
(Csf1rcreERR26-tdTomato), an approach that enabled fate mapping of 
all cells that had expressed CSF1R at a certain developmental stage. 
After the start of tamoxifen treatment, lining CX3CR1+ macrophages 
endothelial cells (blue) along the synovial cavity in three dimensions (AF, 
grey). Scale bars, 100 μm. Bottom, high-resolution 3D reconstruction of a 
processed CLSM scan of the synovial macrophage lining (tdTomato, red; 
Phalloidin, green; DAPI, blue). Scale bar, 10 μm. c, Three-dimensional 
reconstruction of LSFM data of the spatial orientation of synovial 
macrophages (tdTomato, red) and CD31+ endothelial cells (blue) of the 
synovial capillary network (AF, grey). Scale bars, 100 μm. d, CLSM of the 
synovial membrane visualizing synovial macrophages (tdTomato, red) and 
PMNs (Ly6G, green) at the indicated time points upon the induction of 
K/BxN STA. Scale bars, 20 μm (top), 5 μm (bottom). ac, articular cartilage; 
st, synovial tissue. 


acquired tdTomato expression only gradually over time, reaching 65% 
tdTomato+ cells after 6 weeks. These data indicate that CX3CR1+ lining 
macrophages have a half-life of approximately five weeks and did 
indeed originate from local CSF1R-expressing CX3CR1− interstitial 
macrophages (Fig. 2c, d). 
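
The relationship between the 65% labelling at 6 weeks and a turnover time of roughly five weeks can be checked with a back-of-the-envelope sketch. It assumes that labelling rises linearly from 0% at the start of tamoxifen treatment and reads the half-life as the time at which half of the lining cells are labelled. Both are assumptions made here for illustration (the legend of Fig. 2d mentions a linear regression, but the exact procedure is not given in this text), and the function name is hypothetical.

```python
def weeks_to_half_labelling(labelled_percent: float, weeks: float) -> float:
    """Time at which 50% of cells are labelled, assuming the labelled
    fraction rises linearly from 0% at week 0 (illustrative assumption)."""
    if labelled_percent <= 0 or weeks <= 0:
        raise ValueError("labelled_percent and weeks must be positive")
    slope = labelled_percent / weeks  # percentage points gained per week
    return 50.0 / slope


# 65% tdTomato-positive lining macrophages after 6 weeks (values from the text).
print(round(weeks_to_half_labelling(65.0, 6.0), 1))  # 4.6
```

Under these assumptions the estimate is about 4.6 weeks, in line with the stated value of approximately five weeks.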

Additional characterization of parabiotic wild-type mice that shared 
circulation with DsRed-transgenic mice confirmed that, during steady 
state, both CX3CR1+ lining macrophages and CX3CR1− interstitial 
macrophages maintained their numbers independent of monocytes 
(Extended Data Fig. 2d-i). Although blood-derived monocytes 
contributed to the pool of synovial macrophages during STA, this influx 
only partially accounted for the inflammation-induced increase in 
macrophage numbers (Fig. 2e, f, Extended Data Fig. 2j, k), which 
indicates an increased proliferative response of tissue-resident synovial 
macrophages during arthritis. To differentiate between the proliferation 
of CX3CR1− interstitial macrophages and that of CX3CR1+ lining 
macrophages, we crossed R26-tdTomato mice with mice that express 
Fig. 2 | CX3CR1+ lining macrophages repopulate locally from CSF1R-expressing 
interstitial macrophages. a, Bright-field fluorescence 
microscopy of the synovial membrane of knees of Cx3cr1GFP (left) and 
corresponding parabiotic wild-type mice (right) (n = 3) after 6 weeks 
of parabiosis (GFP, green; CD68, red; DAPI, blue). Scale bars, 25 μm. 
b, Bright-field fluorescence microscopy of the synovial membrane 
of knees of Cx3cr1GFP mice (n = 3) determining proliferation among 
subsets of macrophages (GFP, green; Ki67, white; CD68, red; DAPI, 
blue). Scale bars, 250 μm (left), 25 μm (right). c, d, CLSM (c) and 
quantification (d) of tdTomato+ macrophages within the synovial 
lining of Csf1rcreERR26-tdTomato mice at the indicated time points 
during tamoxifen treatment. n = 3 for each time point. In c, tdTomato, 
red; F4/80, green; DAPI, blue. Scale bars, 10 μm. In d, the dotted line 
represents the 95% confidence band of linear regression. e, Synovial 
tissue chimerism ratio of CD45+CD11b+F4/80+ macrophages and Ly6G+ 
PMNs of parabiotic DsRed/wild-type mice after 6 weeks of parabiosis 
under steady-state conditions (day 0: macrophage, n = 6; PMNs, n = 5) 


a gene encoding a tamoxifen-dependent Cre recombinase within the 
Cx3cr1 locus (Cx3cr1creERR26-tdTomato mice). Tamoxifen treatment 
resulted in selective and continuous expression of tdTomato in the layer 
of CX3CR1+ synovial lining macrophages. A proportion of CX3CR1+ 
blood monocytes was initially marked after systemic tamoxifen 
treatment, but became rapidly replaced by newly generated tdTomato− 
monocytes. Induction of arthritis four weeks after a systemic tamoxifen 
pulse or a local tamoxifen injection enabled selective fate mapping of 
tissue-resident CX3CR1+ lining macrophages during steady state and 
inflammation, and therefore enabled discrimination from interstitial or 
monocyte-derived macrophages (Fig. 2g–i, Extended Data Fig. 3a-f). 
In conjunction with 5-ethynyl-2’-deoxyuridine (EdU) labelling, this 
approach confirmed that, at the onset of arthritis, CX3CR1+ lining 
macrophages changed their spatial orientation and morphology but 
maintained their position, and neither proliferated nor changed in 
number. By contrast, CX3CR1− macrophages rapidly proliferated and 
accordingly increased in number (Fig. 2h-j, Extended Data Fig. 3f-h). 

Bulk RNA sequencing of sorted steady-state CX3CR1+ lining 
macrophages and CX3CR1− interstitial synovial macrophages showed 
that they did indeed represent distinct macrophage populations, 
both of which are only distantly related to bone-marrow-derived 
macrophages (Fig. 3a, b, Extended Data Fig. 4a-e). In addition, 
unbiased molecular profiling of total synovial CD45+CD11b+Ly6G− 
mononuclear phagocytes by single-cell RNA sequencing 
(scRNA-seq) confirmed the presence of the defined cluster of 
differentiated CX3CR1+ lining macrophages, which co-expressed immune 
genes such as Trem2 or Vsig4 (Fig. 3c, d, Supplementary Table 1). 
However, scRNA-seq also revealed an additional degree of 
heterogeneity among CX3CR1− interstitial macrophages. A large number of 
and 5 days after induction of K/BxN STA (macrophage, n = 7; PMNs, 
n = 7). Data are mean ± s.e.m.; two-tailed Student’s t-test, **P = 0.004. 
f, Absolute numbers of total and blood-derived CD45+CD11b+F4/80+ 
macrophages under steady-state conditions (day 0: total macrophages, 
n = 6; blood-derived macrophages, n = 6) and at day 5 of STA (total 
macrophages, n = 8; blood-derived macrophages, n = 7). Data are 
mean ± s.e.m.; two-tailed Student’s t-test, *P = 0.0261. g-j, Synovial tissue 
of Cx3cr1creERR26-tdTomato mice was analysed 4 weeks after tamoxifen 
pulse to determine EdU incorporation (EdU pulse 4 h before collection) 
into CD45+CD11b+F4/80+ macrophages (h) and tdTomato expression 
within the synovial lining at indicated time points upon the induction 
of STA (F4/80, white; Ly6G, green; tdTomato, red; DAPI, blue) (i). Scale 
bars, 10 μm. j, Quantification of total proliferating EdU+ tdTomato+ 
and tdTomato− macrophages from the paws of tamoxifen-pulsed 
Cx3cr1creERR26-tdTomato mice at day 0 (n = 6), 2 (n = 5), and 5 (n = 6) 
after the induction of STA. Data are mean ± s.e.m. 


interstitial CX3CR1− macrophages expressed relatively high levels of 
mRNAs encoding MHCII and aquaporin (AQP1), whereas another 
population of interstitial CX3CR1− macrophages was characterized 
by the expression of Retnla (which encodes RELM-α) and 
additional genes, such as Mrc1 or Cd163, that have previously 
been implicated in the alternative activation of macrophages 
(Fig. 3c, d). There was also a smaller population of interstitial 
Stmn1-expressing CX3CR1− macrophages that were clustered 
primarily as a result of their high expression of cell-cycle-associated 
genes such as Cdk1; this suggests that they were not a distinct cellular 
population but instead were proliferating interstitial MHCII+ and 
AQP1+CX3CR1− macrophages (Fig. 3c, d, Extended Data Fig. 4f, g). 
Pseudotime analyses indicated that both RELM-α+ macrophages and 
CX3CR1+ lining macrophages were differentiated macrophages that 
originated from the cluster of proliferating MHCII+CX3CR1− interstitial 
macrophages (Extended Data Fig. 4h-j). This analysis additionally 
suggested that the initial proliferation of interstitial MHCII+CX3CR1− 
macrophages was followed by an upregulation of mRNAs encoding 
the transcription factors MAFB and MAF, which have previously been 
shown to interfere with macrophage proliferation!’ (Extended Data 
Fig. 4j). Fate mapping in tamoxifen-treated Csf1rcreERR26-tdTomato 
mice confirmed that, in accordance with this pseudotime model, 
interstitial MHCII+ macrophages immediately responded with the 
expression of tdTomato, whereas interstitial RELM-α+ macrophages 
and CX3CR1+ lining macrophages only gradually acquired tdTomato 
expression over time (Extended Data Fig. 4k-m). Analysis of 
RetnlacreR26-tdTomato mice and Cx3cr1creERR26-tdTomato mice 
indicated that both subsets of macrophages represented the end stages of 
synovial macrophage differentiation, because we detected very few 
Fig. 3 | Transcriptional profiling of synovial macrophage subsets. 
a, b, Principal component (PC) analysis (a) and differential gene 
expression (b) of sorted synovial CD45+CD11b+F4/80+GFP+ lining 
macrophages and CD45+CD11b+F4/80+GFP− interstitial macrophages 
of Cx3cr1GFP mice, and in vitro-cultured bone-marrow-derived 
macrophages (BMDMs) of C57BL/6 mice during steady state (n = 3) 
after bulk RNA sequencing. Differential expression analysis was 
performed with DESeq2. The Wald test was used to calculate two-sided 
P values; adjustment for multiple comparisons was performed with the 


cells that had expressed RELM-α within the synovial lining (Extended 
Data Fig. 4n) or cells that had expressed CX3CR1 within the interstitial 
synovial tissue (Fig. 2i). As expected, continuous diphtheria toxin 
(DT)-mediated depletion of CSF1R-expressing synovial macrophages 
in LysMcreCD115DTR mice resulted in the complete depletion of 
interstitial MHCII+ macrophages, whereas the density of CX3CR1+ lining 
macrophages decreased only slowly with time. Cessation of DT 
treatment led to rapid repopulation of the pool of proliferating interstitial 
MHCII+ macrophages, whereas the repopulation of CX3CR1+ lining 
macrophages was delayed (Extended Data Fig. 4o, p). Together, these 
experimental datasets supported the scRNA-seq-based pseudotime 
model of a dynamic continuum within resident synovial macrophages, 
in which proliferating MHCII+CX3CR1− interstitial macrophages 
further differentiate either into CX3CR1+ lining macrophages or 
RELM-α+ interstitial macrophages (Extended Data Fig. 5q). 
scRNA-seq confirmed that the onset of STA resulted in the appearance 
of additional clusters of mononuclear phagocytes that displayed 
the signature of monocyte-derived macrophages; these clusters 
expanded during the progression of arthritis. These mostly Ccr2- and 
Ly6c2-expressing cells displayed a pro-inflammatory activation 
profile, including the expression of Il1b (Extended Data Fig. 5a, b and 
Supplementary Table 2). Tamoxifen-pulsed Cx3cr1creERR26-tdTomato 
mice enabled us to fate map tdTomato-expressing CX3CR1+ lining 
macrophages within the generated scRNA-seq datasets throughout the 
course of STA. This analysis showed that, despite their increasingly 
inflammatory microenvironment, lining macrophages stably 
maintained their immune-regulatory phenotype, including the expression 
of Trem2 and of high levels of receptors that mediate the clearance of 
apoptotic cells, such as Axl and Mfge8 (Extended Data Fig. 5b, c). A 
comparison of available scRNA-seq datasets from human rheumatoid 
arthritis synovium’ showed that the expression profiles of two of four 
recently described subsets of human synovial monocytes (SC-M2 and 
Benjamini-Hochberg method. c, d, t-distributed stochastic neighbour 
embedding (t-SNE) scRNA-seq profiles (c) and dot plot (d) showing the 
average expression level of selected marker genes of the respective clusters 
of sorted synovial CD45+CD11b+Ly6G− mononuclear phagocytes of 
Cx3cr1creERR26-tdTomato mice analysed 4 weeks after tamoxifen pulse 
during steady-state conditions (n = 7,362 cells). The average expression 
level is calculated over all cells expressing the given gene. The size of the 
dots represents the percentage of cells expressing a gene. 
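
The Benjamini-Hochberg adjustment named in this legend can be sketched in a few lines. This is a generic, self-contained illustration of the procedure, not the DESeq2 implementation used in the study; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values for four genes.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each raw P value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw values grow.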


SC-M3) matched the profiles of mouse resident synovial macrophages, 
whereas the other two subsets of human synovial monocytes (SC-M1 
and SC-M4) resembled mouse monocyte-derived synovial mac- 
rophages (Extended Data Fig. 5d). 

Notably, CX3CR1+ lining macrophages also displayed features that 
are otherwise typical of barrier-forming epithelial cells, and expressed 
mRNAs encoding tight-junction proteins such as JAM1 (F11r), ZO-1 
(Tjp1) and claudin 5 (Cldn5), as well as genes involved in planar cell 
polarity, including Fat4 and Vangl2 (Extended Data Figs. 4a-e, 5b). 
Consistent with this, confocal immunofluorescence microscopy and 
transmission electron microscopy images showed the expression of 
tight-junction and gap-junction proteins, as well as the presence of 
definite tight junctions, adherens junctions, desmosomes and 
prominent cellular interdigitations at the cell-cell border of CX3CR1+ lining 
macrophages (Extended Data Figs. 6, 7a–d). The results of confocal 
immunofluorescence microscopy and flow cytometry of human 
synovial tissue confirmed a dense macrophage lining, consisting of closely 
associated TREM2+ macrophages that also expressed tight-junction 
proteins (Extended Data Fig. 8a-c). These TREM2+MHCII− 
macrophages comprised 10-30% of the total human synovial macrophages 
during steady state in samples derived from patients with osteoarthritis, 
but were seemingly outnumbered by TREM2− mononuclear 
phagocytes that dominated the disrupted synovial lining of patients with 
rheumatoid arthritis (Extended Data Fig. 8b-e). 

Tight junctions between synovial lining macrophages rapidly 
disintegrated both during STA and during human rheumatoid arthritis, 
correlating with the changing physical density of this macrophage network 
during the onset and resolution of inflammation (Fig. 4a, b, Extended 
Data Figs. 6, 7e-i, 8a–e, 9a). Magnetic resonance imaging confirmed 
that this disintegration of the tight-junction-mediated barrier of 
macrophages was accompanied by an increased intra-articular influx of 
contrast agent during the initiation of STA (Fig. 4c, Extended Data 
Fig. 4 | CX3CR1+ synovial lining macrophages provide a tight junction-mediated 
anti-inflammatory barrier for the joint. a, b, LSFM-derived 
3D reconstruction of spatiotemporal changes (a) and calculated lining 
density (b) of tdTomato+ macrophages from the knees of Cx3cr1creR26-tdTomato 
mice (n = 5) at the indicated days (0-7) after the induction of 
K/BxN STA. Data are mean ± s.e.m.; Kruskal-Wallis H-test with 
Dunn’s multiple comparison test, **P = 0.0025, *P = 0.034. Scale bars, 
100 μm. c, Representative magnetic resonance imaging analysis of knee 
joints at the indicated days (n = 4 each day) of STA showing sagittal 
T1-weighted images after the application of contrast agent (top) and 
transverse T1-weighted images after the application of contrast agent 
and merged with T1-weighted dynamic-contrast-enhanced colour 
maps (bottom). The merged magnetic resonance images include a 
colour-coded map of the area under the curve (AUC) of contrast-agent 
accumulation over 12 min (bottom), ranging from yellow (high AUC 
values) to red (low AUC values). d, Depletion strategy for CX3CR1+ 
lining macrophages. Cx3cr1creiDTR and iDTR control mice received 
2 × 500 ng DT intraperitoneally 5 days before the administration of 
K/BxN serum. e, Representative bright-field fluorescence microscopy 
of the synovial lining on day 5 after the application of DT (phalloidin, 
green; F4/80, red; DAPI, blue). f, Normalized signal intensity curves 


Fig. 9b). This inflammation-associated barrier breakdown occurred 
after the deposition of autoantibody-containing immune complexes, 
which were immediately ingested by CX3CR1+ lining macrophages 
(Extended Data Fig. 9c, d). Depletion of PMNs and Ly6Chigh 
inflammatory monocytes did not interfere with barrier breakdown, indicating 
that disintegration of tight junctions within the synovial macrophage 
lining resulted from an initial immune-complex-mediated activation of 
CX3CR1+ lining macrophages and was independent of the recruitment 
of inflammatory myeloid cells (Extended Data Fig. 9e, f). To 
investigate the role of tight-junction-expressing CX3CR1+ macrophages 
during arthritis, we crossed Cx3cr1cre mice with mice containing a 
Cre-inducible DT receptor (Cx3cr1creR26-iDTR mice), allowing for 
the DT-mediated depletion of CX3CR1+ resident synovial lining 
macrophages. This protocol resulted in the additional depletion of a 
proportion of blood monocytes; however, with the exception of tissue-resident 
macrophages, they repopulated within 48 h (Fig. 4d, e, Extended Data 
Fig. 9g). Magnetic resonance imaging on day 5 after DT injection 
confirmed that the selective absence of CX3CR1+ macrophages indeed 
abolished the synovial barrier in healthy mice, which corresponded to 
the barrier breakdown observed during the onset of arthritis (Fig. 4f). 
Both systemic and local depletion of CX3CR1+ lining macrophages, as 
from the dynamic-contrast-enhanced magnetic resonance imaging of 
synovial tissue from the knee joints of DT-treated mice at the indicated 
days of STA, over 90 measurements with intervals of 7 s in Cx3cr1creiDTR 
(n = 8 knee joints) and iDTR control mice (n = 10 knee joints). Data are 
mean ± s.e.m. of AUC, two-tailed Student's t-test; day 0, ***P = 0.0003; 
day 1, ***P = 0.0001. g, Clinical course of STA, including AUC of the 
corresponding clinical index, in Cx3cr1creiDTR (n = 10) and iDTR 
control (n = 9) of DT-treated mice. Data are mean ± s.e.m.; two-tailed 
Mann-Whitney U-test for clinical index with *P < 0.05 and **P < 0.01; 
two-tailed Student’s t-test for AUC, **P = 0.0093. h, Clinical course of 
STA, including AUC of the corresponding clinical index, in C57BL/6 
wild-type mice treated with imatinib (80 μg kg−1, oral gavage, twice daily, 
n = 4) or vehicle (n = 6) starting one day before the induction of STA. 
Mean ± s.e.m.; Mann-Whitney U-test for clinical index with *P < 0.05 
and **P < 0.01; two-tailed Student’s t-test for AUC with ***P = 0.0001. 
i, Clinical course of STA, including AUC of the corresponding clinical 
index, in LysMcreCD115DTR (n = 7) and CD115DTR control mice 
(n = 10) treated with DT (500 ng per mouse intraperitoneally) starting one 
day before STA induction followed by a daily intraperitoneal injection of 
100 ng DT. Mean ± s.e.m.; Mann-Whitney U-test for clinical index with 
*P < 0.05; two-tailed Student’s t-test for AUC, *P = 0.0417. 
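
The area under a signal-intensity curve sampled at fixed intervals, as in panel f, can be approximated with the trapezoidal rule. The sketch below is an illustration under stated assumptions: the integration rule is chosen here because the legend does not say which one was used, and the function name and example signal are hypothetical. Only the 90 measurements and the 7-s spacing come from the legend.

```python
def auc_trapezoid(signal, dt_s=7.0):
    """Area under a uniformly sampled curve using the trapezoidal rule.

    signal: normalized signal intensities, one per measurement.
    dt_s:   spacing between measurements in seconds (7 s in the legend).
    """
    if len(signal) < 2:
        raise ValueError("need at least two measurements")
    return sum((a + b) / 2.0 * dt_s for a, b in zip(signal, signal[1:]))


# Hypothetical flat curve with 90 measurements at normalized intensity 1.0.
flat_curve = [1.0] * 90
print(auc_trapezoid(flat_curve))  # 623.0 (89 intervals x 7 s)
```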


well as forced disintegration of tight junctions upon the injection of 
claudin 5 peptidomimetics, resulted in a disrupted barrier function, 
an early and exacerbated onset of arthritis and accelerated PMN influx 
(Fig. 4g, Extended Data Fig. 9h-k). Together, these data suggest that 
this subset of macrophages exerts an important immune-regulatory 
function by maintaining a physical and functional tight-junction-mediated 
barrier that secludes and protects intra-articular structures 
and thereby controls the onset of inflammation. Consistent with this, 
the tyrosine kinase inhibitor imatinib (which has been shown to 
stabilize the formation of tight junctions at the blood-brain barrier'’) 
was found to interfere with the onset of arthritis (Fig. 4h). During 
STA in LysMcreCD115DTR mice, DT-mediated depletion of CSF1R+ 
monocytes and macrophages occurred without directly targeting 
CSF1R−CX3CR1+ lining macrophages, an intervention that did not 
affect the onset of arthritis but, in accordance with a pro-inflammatory 
role of monocyte-derived macrophages, accelerated the resolution of 
inflammation (Fig. 4i, Extended Data Fig. 9l). 

Our data reveal a complex functional specialization within synovial 
macrophage subsets and demonstrate the divergent roles of different 
tissue-resident and monocyte-derived macrophages during 
homeostasis and inflammation. The identification of an internal, locally 
renewing and protective tight-junction-mediated macrophage 
barrier has important implications for our general understanding of the 
role of macrophages in health and disease (Extended Data Fig. 10). 
Other tissue-resident macrophages might use similar mechanisms 
to fulfil related gatekeeping functions, thereby determining the onset 
and resolution of inflammation, modulating host defence!®, preventing 
neutrophil-driven inflammatory tissue damage”? or potentially 
interfering with an anti-tumour immune response. 
METHODS 


For additional information on materials used, see Supplementary Table 3. 
Ethical compliance. We complied with all relevant ethical regulations in terms 
of animal experiments and human samples in this study. All animal experiments 
conducted at the University of Erlangen were performed in accordance with 
German guidelines and laws, were approved by local animal ethics committees of 
the Regierung von Mittelfranken and were conducted according to the guidelines 
of the Federation of European Laboratory Animal Science Associations. Parabiosis 
experiments were approved by the Animal Care and Ethics Committee of the 
Centro Nacional de Investigaciones Cardiovasculares and local authorities. 

Synovial biopsies were obtained from knee joints of patients diagnosed with 
osteoarthritis or rheumatoid arthritis. Patients with rheumatoid arthritis fulfilled 
the 2010 EULAR/ACR criteria of rheumatoid arthritis. All patients were >18 years 
of age. Patients with osteoarthritis were recruited at the Department of Trauma 
Surgery, University Hospital Erlangen and patients with rheumatoid arthritis 
were recruited at the Department of Internal Medicine 3 - Rheumatology and 
Immunology, University Hospital Erlangen. All patients signed an informed con- 
sent form, which was approved by the local ethics committee of the University 
Hospital Erlangen. 
Mice. For all experiments, mice of both sexes were used. For details regarding 
mouse strains, see Supplementary Table 3. All mice used were aged between 8 and 
18 weeks, unless stated otherwise. No statistical methods were used to predeter- 
mine sample size. 

To generate Cx3cr1creR26-tdTomato mice, STOCK Tg(Cx3cr1-cre)MW126Gsat/Mmucd 
mice (identification number 036395-UCD) were obtained from the Mutant 
Mouse Regional Resource Center (MMRRC), a National Institutes of Health (NIH)-funded 
strain repository, and were donated to the MMRRC by the National Institute 
of Neurological Disorders and Stroke (NINDS)-funded GENSAT BAC transgenic 
project. B6.129P2(C)-Cx3cr1tm2.1(cre/ERT2)Jung/J (Cx3cr1creER), FVB-Tg(Csf1r-cre/ 
Esr1*)1Jwp/J (Csf1rcreER), C57BL/6-Gt(ROSA)26Sortm1(HBEGF)Awai/J 
(iDTR), B6;129S6-Gt(ROSA)26Sort"(CAG-tdTomato)Hze/J (tdTomato), C57BL/6- 
Tg(Csf1r-HBEGF/mCherry)1Mnz/J (CD115DTR) and B6.129P-Cx3cr1tm1Litt/J 
(Cx3cr1GFP) mice were purchased from Jackson Laboratories. These mice were 
bred and housed at the animal facilities of the University of Erlangen under 
specific-pathogen-free conditions. ColVIcre mice were generated in the 
laboratory of G. Kollias and have previously been described”!. RetnlacreR26-tdTomato 
mice were generated in the laboratory of D. Vöhringer”. 
In vivo treatments and arthritis induction. K/BxN STA was induced by the 
injection of K/BxN serum collected from arthritic K/BxN mice. Clinical develop- 
ment of arthritis was evaluated using a clinical index ranging from 0 (minimum) 
to 16 (maximum), which represents a cumulative score of 0 to 4 for each paw, 
with 0 = no signs of inflammation; 1 = minor swelling and reddening of a paw, 
or affecting only single digits; 2 = moderate swelling and erythema, or affecting 
multiple digits per paw; 3 = severe swelling and erythema affecting the whole paw; 
4 = maximum swelling and erythema. Measurements of hind-paw swelling were 
conducted using dial thickness gauges (Peacock). 
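
The scoring scheme above can be illustrated with a short sketch. It assumes one integer score (0 to 4) per paw, summed over four paws; the function name and input layout are illustrative and not part of the study protocol.

```python
def clinical_index(paw_scores):
    """Sum per-paw arthritis scores (0-4 each) into the 0-16 clinical index."""
    if len(paw_scores) != 4:
        raise ValueError("expected one score for each of the four paws")
    for score in paw_scores:
        if score not in (0, 1, 2, 3, 4):
            raise ValueError(f"paw score must be an integer from 0 to 4, got {score!r}")
    return sum(paw_scores)


# Hypothetical mouse: moderate, severe, unaffected and mildly affected paws.
print(clinical_index([2, 3, 0, 1]))
```

A mouse with maximum swelling in all four paws therefore reaches the maximum index of 16.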

Cx3cr1creERtdTomato mice were treated systemically with 4 mg tamoxifen
dissolved in peanut oil intraperitoneally (i.p.) twice within 48 h. Csf1rcreERR26-
tdTomato mice were fed with tamoxifen-containing food (400 mg kg−1 tamoxifen
citrate, Envigo) for 5 days, 4 weeks or 6 weeks. Local administration of (Z)-4-
hydroxytamoxifen (Sigma-Aldrich, H7904) was performed by intra-articular injec-
tion of 25 µl of (Z)-4-hydroxytamoxifen (2 mM) dissolved in PBS/4% ethanol into
the right knee joint. Five days after injection, K/BxN serum transfer arthritis was 
induced. Mice were analysed seven days after the induction of arthritis. 

For systemic administration of diphtheria toxin (DT), Cx3cr1creiDTR mice received
500 ng DT per mouse i.p. on 2 consecutive days, beginning 6 days before the induction
of STA. Local depletion of macrophages in Cx3cr1creiDTR mice was induced by
the injection of 50 ng DT in 50 µl PBS directly into the hind paw 3 days before the
induction of STA. The contralateral hind paw was injected with PBS and served 
as a control. 

LysMcreCD115DTR mice received a single dose of 500 ng DT per mouse (i.p.)
1 day before the induction of arthritis followed by daily injections of 100 ng DT per
mouse (i.p.). To study the depletion of synovial macrophages in LysMcreCD115DTR
mice under healthy conditions, 500 ng DT per mouse was injected intraperitoneally 
3 times a week until day 10. 

The claudin peptidomimetics (C5C2, sequence SSVVQSTGHMQSKV
YESVLALSAEVQAAR-NH2) and scrambled variant (C5C2scr, AHLVRSVSD
VMQSQTGKTSESYSAVQLVA-NH2) were dissolved in PBS and injected intra-
venously (i.v.) (3.5 µmol kg−1) one day before and one day after the induction of
STA for clinical evaluation, with a single injection one day before the induction of 
STA for the magnetic resonance imaging experiments. 

Imatinib (80 µg kg−1) was given orally twice a day, starting one day before the
induction of STA. Imatinib was dissolved in aqueous vehicle solution containing 
0.5% (hydroxypropyl)cellulose and 0.05% TWEEN 80. 


K/BxN serum IgG was isolated with protein-G Gravi-Trap (GE Healthcare,
28-9852-55) and labelled using SAIVI Alexa Fluor 647 Antibody/Protein 1 mg-
Labelling Kit. Labelled IgGs were injected i.v. with K/BxN serum at a ratio of 1:4.

Collagen-induced arthritis was induced as previously described. In brief,
C57BL/6 mice were immunized by intradermal injection of an emulsion of 200 µg
of chicken type II collagen (Sigma-Aldrich, C-930) with 250 µg heat-inactivated
Mycobacterium tuberculosis H37RA in incomplete Freund’s adjuvant (Sigma- 
Aldrich, F5506) at day 0 and day 21. 

Depletion of neutrophils and Ly6C+ monocytes was performed by i.v. injec-
tion of 200 µg InVivoPlus anti-mouse Ly6G/Ly6C (Gr-1) (clone: RB6-8C5, Bio X
Cell, BP0075) one day before K/BxN serum transfer. InVivoPlus rat IgG2b isotype
control, anti-keyhole limpet haemocyanin (200 µg, i.v., clone: LTF-2, Bio X Cell,
BP0090) served as control. 

The severity of both K/BxN STA and collagen-induced arthritis was scored in a 
blinded manner. Mice were not randomized before the experiment. 

Parabiosis. Parabiosis was performed following a previously described proce-
dure. Mice were shaved under anaesthesia at the corresponding lateral region
and incisions were made from the olecranon to the knee joint of each mouse. 
Olecranons and knee joints of partner mice were tied together by a single 5-0 poly- 
propylene suture. Dorsal and ventral skins were stitched up forming a continuous 
suture. Finally, each mouse received a single injection of buprenorphine subcuta-
neously. Analysis was performed as indicated six or nine weeks after surgery. The
tissue chimerism was calculated as the proportion of partner-derived macrophages
in synovial joints divided by the proportion of partner-derived monocytes in the
blood, as quantified by flow cytometry. A tissue-to-blood chimerism of one thus
represents equal proportions of partner-derived cells in blood and tissue.
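
The chimerism calculation can be sketched as follows. This is a minimal illustration that assumes event counts exported from flow cytometry gates; the function names and the example counts are hypothetical.

```python
def partner_fraction(partner_cells, total_cells):
    """Fraction of gated cells that derive from the parabiotic partner."""
    if total_cells <= 0:
        raise ValueError("total cell count must be positive")
    return partner_cells / total_cells


def tissue_to_blood_chimerism(tissue_partner, tissue_total,
                              blood_partner, blood_total):
    """Partner-derived fraction in tissue divided by that in blood."""
    blood_fraction = partner_fraction(blood_partner, blood_total)
    if blood_fraction == 0:
        raise ValueError("no partner-derived blood monocytes; ratio undefined")
    return partner_fraction(tissue_partner, tissue_total) / blood_fraction


# Hypothetical counts: 2% partner-derived synovial macrophages,
# 40% partner-derived blood monocytes.
ratio = tissue_to_blood_chimerism(20, 1000, 400, 1000)
print(f"tissue-to-blood chimerism: {ratio:.3f}")
```

A value well below one, as in this made-up example, would indicate that the tissue population receives little input from circulating monocytes.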

Flow cytometry and fluorescence-activated cell sorting. For isolation of synovial 
macrophages, hind paws were dissected by removing skin, muscle and tendons. 
Cells were dissociated by incubation for 45 min in a digestion medium consist- 
ing of RPMI medium, 10% heat-inactivated fetal calf serum (FCS), collagenase 
(2 mg ml−1) from Clostridium histolyticum (Sigma-Aldrich, C5138-1G) and
0.03 mg ml−1 DNase (Sigma-Aldrich, 9003-98-9). After washing with PBS
containing 2% heat-inactivated FCS and 2 mM EDTA, cells were blocked with 
10% rat serum in PBS for 10 min at room temperature and stained with fluoro- 
phore-conjugated antibodies for 20 min at 4°C. After washing with PBS, cells were 
resuspended in FACS buffer (PBS, 2% FCS). EdU, which was injected intraperi-
toneally (50 mg kg−1) 4 h before collecting the cells, was detected using the EdU
base click EdU-Flow Cytometry Kit 488 (BCK-FC488-100). 

Human synovial tissue was dissociated in digestion medium containing RPMI 
medium, 10% heat-inactivated fetal calf serum (FCS), collagenase (2 mg ml−1)
from C. histolyticum (Sigma, C5138-1G) and 0.03 mg ml−1 DNase (Sigma, 9003-
98-9) at 37 °C for 45 min. After washing with PBS, cells were incubated with
Zombie Aqua (1:1,000, BioLegend) for 15 min at room temperature. Dissociated
cells were fixed with 4% PFA/PBS, incubating for 15 min at room temperature.
After washing with 1% BSA/PBS, cells were resuspended in saponin-based perme- 
abilization buffer containing the following antibodies: CD11b-AF488, MHC II-PE, 
CD14-PeCy7, CD45-AF700, CD1c-PerCP/Cy5.5, CD20-BV421, CD15-BV421 and 
TREM2-APC, overnight at 4-8 °C. 

Flow cytometry was performed with a CytoFLEX S, Beckman Coulter. Sorting
of cells was performed with a MoFlo XDP, Beckman Coulter and the Summit
Software System. Data were analysed with Kaluza (Beckman Coulter, v.1.5a), 
CytExpert (Beckman Coulter, v.2.2.0.97) or FlowJo (v.7.6.5). 
Bulk RNA sequencing of sorted CX3CR1+ and CX3CR1− macrophages
and bone-marrow-derived macrophages. Tissues from Cx3cr1GFP mice
were prepared for sorting of synovial macrophages as described in the section
‘Flow cytometry and fluorescence-activated cell sorting’. Macrophages were
defined as CD45+CD11b+F4/80+. Expression of enhanced GFP discriminated
CX3CR1+ lining and CX3CR1− macrophages. For generating bone-marrow-derived
macrophages (BMDMs), bone marrow cells were isolated from femurs of C57BL/6
mice and cultured for one day in DMEM (10% FCS, 1% penicillin–streptomycin).
Non-adherent macrophage precursors were cultivated and differentiated to
macrophages for 5 days in M-CSF-conditioned DMEM medium (10% FCS,
1% penicillin–streptomycin).
Isolation of RNA was performed using the RNeasy Mini kit (Qiagen, 74104). 
Libraries were subjected to single-end sequencing (101 bp) on a HiSeq-2500 
platform (Illumina). The obtained reads were converted to .fastq format and 
demultiplexed using bcl2fastq v2.17.1.14. Quality filtering was performed using 
cutadapt v.1.15; then reads were mapped against the mouse reference genome
(Ensembl GRCm38, release 91) using the STAR aligner v.2.5.4a, and a STAR
genome directory created by supplying the Ensembl gtf annotation file (release
91). Read counts per gene were obtained using featureCounts program v.1.6.1
and the Ensembl gtf annotation file. The subsequent analyses were performed
using R v.3.5.0. In particular, differential expression analysis was performed
with the DESeq2 package v.1.20.0 and plots were generated with the
ggplot2 package v.2.2.1.


Quantitative real-time PCR of sorted CX3CR1+ macrophages and BMDMs.
RNA of sorted GFP+ macrophages from Cx3cr1GFP mice and BMDMs was isolated
using the RNeasy Mini kit (Qiagen, 74104). Reverse transcription of total RNA
was performed with human leukaemia virus reverse transcriptase using the
GeneAmp RNA PCR kit (Applied Biosystems) and oligo(dT)16 primers (Invitrogen).
Quantification of gene expression was performed as previously described. The
following primer sequences were used: β-actin: TGT CCA CCT TCC AGC AGA
TGT (sense), AGC TCA GTA ACA GTC CGC CTA GA (antisense); ZO-1: GCT
AAG AGC ACA GCA ATG GA (sense), GCA TGT TCA ACG TTA TCC AT
(antisense); claudin 5: TTA AGG CAC GGG TAG CAC TCA GG (sense), TTA
AGG CAC GGG TAG CAC TCA CG (antisense); claudin 10: TGG TGT GTG
GTG TTG GAG GGT TTG G (sense), TGG AAG GAG CCC AGA GCG TTA
CCT G (antisense).
Single-cell sequencing of sorted myeloid cells of different stages of arthritis. 
Sorted CD45+CD11b+Ly6G− synovial cells of hind paws of mice at steady state
(day 0) and at different stages of K/BxN STA (day 1, day 2 and day 5 after serum 
transfer) were subjected to 10x Chromium Single Cell 3’ Solution v2 library 
preparation according to the manufacturer's instructions. Library sequencing was 
performed on an Illumina HiSeq 2500 sequencer to a depth of 100 million reads 
each. Reads were converted to .fastq format using mkfastq from cellranger 2.1.0 
(10x Genomics). Reads were then aligned to the mouse reference genome (mm10, 
Ensembl annotation release 91) including the additional sequence and feature 
annotation for tdTomato. Alignment was performed using the count command 
from cellranger 2.1.0 (10x Genomics). Primary analysis, quality control filtering 
(gene count per cell, unique molecular identifier count per cell, percentage of 
mitochondrial transcripts), clustering, cell-cycle phase scoring based on canonical 
markers and regression, identification of cluster markers and visualization of gene 
expression were performed using the Seurat (v.2.3) package for R.
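
The three quality-control criteria named above (gene count, unique molecular identifier count and mitochondrial transcript percentage per cell) can be illustrated with a plain-Python sketch. This is not Seurat code, and the threshold values are placeholders chosen for illustration only; the cut-offs used in the study are not stated here.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    barcode: str
    n_genes: int      # genes detected in the cell
    n_umis: int       # unique molecular identifiers counted in the cell
    pct_mito: float   # percentage of mitochondrial transcripts


def passes_qc(cell, min_genes=200, max_genes=5000,
              max_umis=30000, max_pct_mito=10.0):
    """Apply the three per-cell filters; all thresholds are hypothetical."""
    return (min_genes <= cell.n_genes <= max_genes
            and cell.n_umis <= max_umis
            and cell.pct_mito <= max_pct_mito)


# Made-up cells: one acceptable, one likely empty droplet, one stressed cell.
cells = [
    CellQC("cell_a", n_genes=1800, n_umis=6000, pct_mito=3.2),
    CellQC("cell_b", n_genes=90, n_umis=150, pct_mito=1.0),
    CellQC("cell_c", n_genes=2100, n_umis=7000, pct_mito=27.5),
]
kept = [cell.barcode for cell in cells if passes_qc(cell)]
print(kept)
```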
Construction of single-cell trajectories, identification of genes changing as a 
function of pseudotime and clustering of genes by pseudotemporal expression 
patterns were performed using the Monocle 2 package for R. Pseudotime calcu- 
lations were performed on the top 1,000 differentially expressed genes between 
clusters. For gene ontology enrichment analysis of biological processes, the
PANTHER Statistical Overrepresentation Test (http://www.pantherdb.org) was 
used. 
Cryo-sectioning of mouse knee joints. Mouse long bones were fixed in 4% 
PFA/PBS (pH 7.4) for 12 h at 4-8 °C, incubated for 10 days in decalcification 
buffer (14% EDTA free acid, NH4OH, pH 7.2) and embedded in OCT Compound 
(Sakura Finetek). A Leica CM 3050 S cryostat and Cryofilm Type 2C(9) 
(C-MK001-A2, Section-Laboratory) were used for the generation of 7-µm-thick
histological sections. 
Histological immunofluorescence staining. For staining cryo-sections of mouse 
knee joints, samples were blocked with rat serum or 0.2% BSA and permeabilized 
with 0.1% saponin in PBS for 1 h at room temperature. For immunofluorescence 
staining, the antibodies listed in Supplementary Table 3 were used. Staining was 
performed for 4 h at room temperature or overnight at 4°C using the indicated 
antibodies diluted in blocking solution. Unbound primary antibodies were washed 
off with DPBS and unlabelled primary antibodies were counterstained with donkey 
anti-Rabbit IgG AF488 or AF647 antibody in blocking solution for 4 h at room 
temperature and washed with DPBS. Joint sections were stained with DAPI or 
SYTOX Blue by incubating samples for 10 min (DAPI) or 1 h (SYTOX Blue) at 
room temperature. Samples were washed three times with DPBS, once with water 
for injection, and embedded onto a coverslip with Dako Fluorescence Mounting 
Medium. 
Bright-field fluorescence microscopy of histological samples. Histological joint 
samples were imaged with an upright Nikon Eclipse Ni-U microscope, using a 
10× (numerical aperture (NA) 0.30), 20× (NA 0.50) or 40× (NA 0.75) CFI Plan
Fluor objective for varying magnifications. The halogen lamp excitation light
(of wavelength λex) as well as the emitted light (λem) was filtered specifically
according to individual excitation/emission profiles: DAPI λex: 390/18 nm and
λem: 460/60 nm, fluorescein isothiocyanate (FITC)/AF488 λex: 475/35 nm and
λem: 530/43 nm, tdTomato/AF594 λex: 542/20 nm and λem: 620/52 nm, AF647
λex: 628/40 nm and λem: 692/40 nm. The generated data were processed with
Imaris software. 
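
For readers unfamiliar with the filter notation, the sketch below converts a label such as 390/18 nm into its pass band. It assumes the common convention that the first number is the centre wavelength and the second the full bandwidth; the function name is illustrative.

```python
def passband_nm(filter_spec):
    """Convert a 'centre/width' filter label (in nm) to (lower, upper) edges."""
    centre_text, width_text = filter_spec.split("/")
    centre, width = float(centre_text), float(width_text)
    return centre - width / 2, centre + width / 2


# Excitation and emission filters listed for DAPI and FITC/AF488.
for label in ("390/18", "460/60", "475/35", "530/43"):
    low, high = passband_nm(label)
    print(f"{label} nm -> {low:g} to {high:g} nm")
```

Under this convention the DAPI excitation filter passes 381 to 399 nm.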
Confocal laser scanning microscopy of histological samples. For high-mag- 
nification imaging of histological joint sections, a Leica TCS SP 5 II confocal 
microscope with acousto-optic tunable filter and acousto-optical beam splitter, 
and hybrid detector (HyD) on a DMI6000 CS frame was used. Imaging of coverslip-
embedded samples was performed using an HCX PL APO 100× oil objective
with a NA of 1.44. Fluorescence signals were generated via sequential scans, excit- 
ing tdTomato using a diode-pumped solid-state laser at 561 nm and detecting 
with a HyD at 600-650 nm. The second sequence for visualizing Alexa Fluor 488 
or FITC-labelled staining included an argon laser at 488 nm for excitation and a 
HyD detector at 500-550 nm. A third imaging sequence involved a simultaneous
excitation of SYTOX Blue with a 458-nm argon laser and of Alexa Fluor 647
staining with a 633-nm helium-neon laser. SYTOX Blue was detected by HyD 
at 470-520 nm and Alexa Fluor signals were detected by HyD at 650-700 nm. 
Generated images were deconvoluted with Huygens Professional and 
3D-reconstructed with Imaris software. 

Spinning disk confocal microscopy of histological samples. For spinning disk 
confocal microscopy of histological joint sections, an inverted Zeiss Spinning 
Disc Axio Observer.Z1 with a Yokogawa CSU-X1M 5000 spinning disk unit, a 
LD C-Apochromat 63× water immersion objective (NA 1.15) and an Evolve 512
EMCCD camera was used. Fluorescence signals were excited and detected as fol-
lows: DAPI λex: 405 nm DPSS laser and λem: 445/50 nm BP filter, AF488 λex:
488 nm DPSS laser and λem: 525/50 nm BP filter, tdTomato λex: 561 nm DPSS laser
and λem: 605/70 nm BP filter. Acquired images were processed via Zen Blue 2.3
image acquisition software. 

Optical clearing of mouse joint samples. Optically cleared samples for light-
sheet fluorescence microscopy (LSFM) were generated as previously described. In detail,
mice received 2.5 µg Ly6G-AF647 or CD31-AF647 in PBS i.v. and were euthanized
after 1 h. Mice were perfused with 5 mM EDTA/PBS and perfusion-fixed with 4%
PFA/PBS (pH 7.4). Knee joints were freed from muscle tissue and post-fixed
in 4% PFA/PBS (pH 7.4) for 4 h at 4-8 °C with gentle shaking. Tissue fixation
was followed by dehydration. Tissue dehydration was performed by increasing 
the proportion of ethanol according to the following series: 50%, 70% and two 
consecutive incubations with 100% ethanol each. The 50% and 70% ethanol 
solutions were generated by diluting 100% ethanol with water for injection, and 
their pH values were adjusted to 9.0 using NaOH. All tissue dehydration steps 
were performed at 4-8 °C in gently shaking 5 ml tubes. After tissue dehydration, 
joint samples were transferred to ethyl cinnamate and incubated at room tem-
perature for 6 h.

LSFM of optically cleared samples. LSFM of optically cleared mouse knee joints 
was performed with a LaVision BioTec Ultramicroscope II including an Olympus
MVX10 zoom body (Olympus), a LaVision BioTec Laser Module, and an Andor
Neo sCMOS Camera with a pixel size of 6.5 µm. Detection optics with an optical
magnification range from 1.26× to 12.6× and a NA of 0.5 were used.

For visualization of general tissue morphology, a 488-nm optically pumped 
semiconductor laser (OPSL) was used to generate autofluorescent signals. For 
tdTomato excitation, a 561-nm OPSL and for CD31-AF647 or Ly6G-AF647 excita- 
tion, a 647-nm diode laser was used. Emitted wavelengths were detected with 
specific detection filters: 525/50 nm for autofluorescence, 620/60 nm for tdTomato, 
and 680/30 nm for CD31-AF647 or Ly6G-AF647. The optical zoom factor of the 
measurements varied from 1.26 to 8 and the light-sheet thickness ranged from 5
to 10 µm.

Three-dimensional lining density analysis. The density of the synovial lining was 
analysed by a volumetric ratio of tdTomato+ lining macrophages to synovial tissue.

Three-dimensional reconstruction of LSFM-scanned mouse knee joints was 
performed using Imaris software. The synovial lining was optically separated from 
the joint tissue by manual surface rendering. Volumes of the isolated synovial lining 
tissue and tdTomato+ lining cells were fully automatically rendered by the Imaris
volume rendering tool with a size threshold of 5 µm for tdTomato+ cells and 10 µm
for synovial tissue. The percentage lining density was calculated from the ratio of 
cell and tissue volumes. 
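
The final step of this analysis reduces to a simple ratio, sketched below. The function name and the example volumes are hypothetical; the volumes themselves would come from the rendered surfaces and must share the same unit.

```python
def lining_density_percent(cell_volume, tissue_volume):
    """Percentage of the synovial lining volume occupied by tdTomato+ cells."""
    if tissue_volume <= 0:
        raise ValueError("tissue volume must be positive")
    if cell_volume < 0:
        raise ValueError("cell volume cannot be negative")
    return 100.0 * cell_volume / tissue_volume


# Made-up volumes in cubic micrometres.
print(lining_density_percent(2.5e5, 1.0e6))
```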

Magnetic resonance imaging. Magnetic resonance imaging (MRI) data were 
acquired using the ClinScan 70/30 7 T MRI System (Bruker) and a RatBrain 
1H-Surface Coil (Bruker). Before measurement, mice were anaesthetized and 
a tail-vein catheter was placed for the injection of contrast agent during meas- 
urement. The body temperature was kept constant with a heating blanket and 
the respiration rate was monitored constantly. Anaesthesia was maintained with 
isoflurane. Dynamic contrast-enhanced (DCE) MRI was conducted using a fast 
low angle shot (FLASH) sequence with repetition time (TR)/echo time (TE): 
2.92 ms/0.88 ms, flip angle: 25°, voxel size: 0.182 × 0.182 × 0.7 mm, matrix 192 × 192,
acquisition time of 12 min and 100 measurements. The contrast agent (0.1 mmol kg−1
Gadovist, Bayer) was injected after 40 s over a time period of 10 s. Sagittal and 
transverse T1-weighted images were acquired after running the DCE sequence 
with the following specifications: voxel size: 0.078 × 0.078 × 0.7 mm, TR/TE:
500 ms/9 ms, matrix 448 × 448. The mean contrast agent enrichment over time
in the synovial tissue set as region of interest was analysed using Horos software 
(https://horosproject.org/). 
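
As a consistency check on the acquisition parameters above, the in-plane field of view follows from voxel size multiplied by matrix size, and the duration of each DCE measurement from the acquisition time divided by the number of measurements. The sketch assumes evenly spaced measurements; the function names are illustrative.

```python
def field_of_view_mm(voxel_size_mm, matrix_size):
    """In-plane field of view along one axis."""
    return voxel_size_mm * matrix_size


def frame_duration_s(acquisition_min, n_measurements):
    """Time per measurement, assuming evenly spaced acquisitions."""
    return acquisition_min * 60.0 / n_measurements


print(round(field_of_view_mm(0.182, 192), 3))  # DCE FLASH sequence
print(round(field_of_view_mm(0.078, 448), 3))  # T1-weighted images
print(round(frame_duration_s(12, 100), 2))     # seconds per DCE measurement
```

Both sequences yield the same field of view of about 34.9 mm, and each DCE measurement lasts about 7.2 s, so under this assumption the contrast injection at 40 s falls after roughly five baseline measurements.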

Transmission electron microscopy. Mouse knee joints were fixed in ITO fixation 
solution containing 2.5% glutaraldehyde (Roth, 4157.1), 2.5% paraformaldehyde 
(Roth, 0335.3), 0.1 M cacodylate buffer (Roth, 5169.2) and 0.3% picric acid dis- 
solved in phosphate-buffered saline (pH 7.3) for two days, decalcified in cacodylate 
buffer (0.1 M) containing 14% EDTA for two weeks and finally embedded in Epon. 
Ultra-thin sections (Microtome, Reichert Ultracut S) of 50 nm were contrasted 
with uranyl acetate and lead(II) acetate trihydrate and finally imaged with a trans-
mission electron microscope (JEM 1400 Plus, Jeol). 
Statistics and reproducibility. For calculations of statistical significance, GraphPad
Prism 5 was used. Data are presented as mean ± s.e.m. and were analysed using the
two-sided Student's t-test, the Mann-Whitney U-test or the Kruskal-Wallis H-test 
with Dunn’s multiple comparisons test as post hoc procedure unless stated otherwise. 
P values less than 0.05 were considered significant. LSFM of Fig. 1a is representa- 
tive of ten individual mice. Experiments in Fig. 1b, c were performed three times. 
Bright-field microscopy (BFM), CLSM and LSFM images of Fig. 1d and Extended 
Data Fig. 1c–e are representative images of three experiments with three mice each
per day. Extended Data Fig. 1a, f are representative images of three individual mice.
Flow cytometry experiments in Extended Data Fig. 1a were performed three times. 
Images in Extended Data Fig. 1g, h are representative of at least three individual mice. 
Parabiosis experiments and analysis of Fig. 2a, e, f and Extended Data Fig. 2d-k were 
performed once each per parabiotic combination and time points and images were 
representative of at least three mice. Images in Fig. 2b and Extended Data Fig. 2b, c 
are representative of three individual mice. Images in Extended Data Fig. 2a are 
representative of three individual mice. Flow cytometry experiments in Fig. 2h, j 
and Extended Data Fig. 3f-h were performed once. Images in Fig. 2i and Extended 
Data Fig. 3e are representative of three mice per group. Experiments in Extended 
Data Fig. 3a–c were performed twice. Experiments in Extended Data Fig. 3e are
representative of three individual mice per day. Bulk RNA-seq analysis of Fig. 3a, b 
and Extended Data Fig. 4a–c was performed once. RNA quantification experiments
in Extended Data Fig. 4d using the sorting strategy in Extended Data Fig. 4a were
performed once. Single-cell RNA profiling experiments of sorted cells of Fig. 3c, d
and Extended Data Figs. 4f–j, 5a–c comprise four different datasets of four individual
mice at steady state, day 1, day 2 and day 5 after K/BxN serum transfer. Images in 
Extended Data Fig. 4k are representative of three individual mice of experiments 
that are shown in Extended Data Fig. 4l, m that were performed once. Images in
Extended Data Fig. 4n are representative of two individual mice. The experiments
in Extended Data Fig. 4o, p were performed once and show representative images
of one mouse per group and the corresponding statistics for each mouse. Extended 
Data Fig. 6 shows representative images of four individual mice. Images in Extended 
Data Fig. 7a are representative of four individual mice. Transmission electron micro- 
graphs in Extended Data Fig. 7 are representative of three mice per time point. 
CLSM images in Extended Data Fig. 8a are representative of six mice per time point. 
CLSM of Extended Data Fig. 8b, c and the corresponding lining density analysis 
are representative of two patients with osteoarthritis and three patients with rheu- 
matoid arthritis. Flow cytometry analyses are representative of three patients with 
osteoarthritis and two patients with rheumatoid arthritis. Experiments in Fig. 4a, b 
and Extended Data Fig. 9a were performed twice. MRI experiments of Fig. 4c, f and
Extended Data Fig. 9k were performed once. BFM images of Fig. 4e are represent- 
ative of 3 mice per group. Mouse experiments of Fig. 4g and Extended Data Fig. 9j 
were performed twice. Mouse experiments of Fig. 4h, i and Extended Data Fig. 9i 
were performed once. Images in Extended Data Fig. 9c, d are representative of three 
individual mice per time point. Images in Extended Data Fig. 9h are representative 
of three individual mice per group. Flow cytometry experiments in Extended Data 
Fig. 9g, l were performed once and confirmed the depletion efficiency. Experiments
shown in Supplementary Videos 1, 3 and 4 were performed three times, that in 
Supplementary Video 2 was performed once, and those in Supplementary Videos 5, 
6 and 7 were each performed five times. 
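
The summary statistic reported throughout (mean ± s.e.m.) can be reproduced with the Python standard library, as sketched below. The s.e.m. is taken as the sample standard deviation divided by the square root of n; the example values are made up, and the significance tests themselves were run in GraphPad Prism, not with this code.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a sample."""
    if len(values) < 2:
        raise ValueError("at least two values are needed to estimate the s.e.m.")
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return statistics.mean(values), sem


# Hypothetical clinical index values from one group of mice.
group = [6.0, 8.0, 7.0, 9.0, 5.0]
mean, sem = mean_sem(group)
print(f"{mean:.2f} ± {sem:.2f}")
```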

Reporting summary. Further information on research design is available in 
the Nature Research Reporting Summary linked to this paper. 
The data that support the plots within this paper and other findings of this study
Extended Data Fig. 1 | Spatiotemporal profiling of synovial CX3CR1+
macrophages. a, BFM of macrophages within the synovial tissue using
the macrophage markers F4/80 (left and middle; colour as indicated)
and CD68 (right; green) in ColVIcreR26-tdTomato reporter mice
(left; tdTomato+, red), Cx3cr1GFP mice (middle; GFP+, green), and
Cx3cr1creR26-tdTomato mice (right; tdTomato+, red). Scale bars, 25 µm.
b, Flow cytometry analysis of macrophages of dissociated hind-paw
joints of Cx3cr1GFP mice (n = 3) gated for CD45+, CD11b+, F4/80+ and
GFP. Data are mean ± s.e.m. c, Representative 3D LSFM showing the
spatial distribution of PMNs (Ly6G, green) and mononuclear phagocytes
(tdTomato+, red) in knee joints of Cx3cr1creR26-tdTomato mice at
indicated time points upon induction of K/BxN STA (AF, grey). Filled
arrowheads point towards the macrophage lining layer to highlight
changes in its morphology upon induction of STA. Scale bars, 100 µm.
d, Exemplary BFM images of the synovial membrane of knee joints of
Cx3cr1creR26-tdTomato mice at day 0, day 2 and day 7 after induction of STA.
Macrophages are defined as tdTomato+ (red) and F4/80+ (white) and
infiltrating neutrophils as Ly6G+ (green) cells. Scale bars, 25 µm.
e, Spinning disk confocal microscopy images of the synovial membrane
of Cx3cr1creR26-tdTomato mice at day 2 after induction of K/BxN STA
visualizing macrophages (tdTomato+, red) and neutrophils (Ly6G, green).
Scale bars, 10 µm. f, CLSM scans of the synovial membrane in knee joints
of ColVIcreR26-tdTomato reporter mice at the indicated time points after
the induction of STA, enabling the visualization of synovial fibroblasts
(tdTomato, red) and macrophages (CD68, green) along the synovial cavity
(sc). Scale bars, 20 µm. g, LSFM of knee joints of Cx3cr1creR26-tdTomato
mice showing the spatial distribution of macrophages (tdTomato, red)
along the synovial cavity at day 21 after the first immunization during
collagen-induced arthritis before the onset of arthritis (steady state)
(top) and at day 35 after the first immunization after onset of joint
inflammation, identifying rearrangement of macrophages in the form
of palisade-like structures (filled arrowheads). Scale bars, 500 µm (left),
100 µm (right). AF, grey. h, CLSM images of knee joints of Cx3cr1creR26-
tdTomato mice at day 21 (top; steady state before onset of arthritis) and
day 35 (bottom; during active arthritis) of collagen-induced arthritis,
illustrating reorganization of lining macrophages (tdTomato, red; CD68,
green). Scale bars, 100 µm; scale bar of magnified view, 10 µm.
Extended Data Fig. 2 | Developmental origin of synovial lining
macrophages. a, Histological CLSM analysis of embryonic mouse knee
joints at E15.5 and E16.5 visualizing CD68 (red) and F4/80 (white)
expressing embryonic macrophages (filled arrowheads) within the newly
formed synovial lining. Scale bars, 50 µm (top), 10 µm (bottom).
b, BFM showing expression of CSF1R and the distribution of macrophages
(F4/80, red) within the synovial tissue of CSF1RGFP mice (GFP, green;
left). c, Representative CLSM scans of Cx3cr1GFP (green) knee joints
confirming the expression of CSF1R (red) by antibody-mediated
staining on interstitial CX3CR1− macrophages. Scale bars, 25 µm; scale
bar of magnified view, 10 µm. d, Gating strategy for analysis of synovial
macrophages isolated from hind paws of parabiotic DsRed/wild-type
mice. Synovial macrophages were defined as DAPI− living, CD45+,
Ly6G−, CD11b+, F4/80+ cells. DsRed expression discriminates the origin.
e, Gating strategy for blood monocytes of parabiotic DsRed/wild-type
mice. Blood monocytes were defined as DAPI− living, CD45+, CD11b+,
Ly6G−, CD115+ and SSClow. DsRed expression discriminates the origin.
f, Representative BFM of the synovial membrane of knee joints of a
wild-type mouse sharing circulation with a DsRed mouse, six weeks after
establishment of parabiosis (n = 3; DsRed, red; F4/80, green). Scale bars,
25 µm. g, BFM images of parabiotic wild-type (top) and DsRed (bottom)
mice after nine weeks of parabiosis. In the wild-type mice, DsRed+
partner-derived macrophages are visible in the bone marrow (bm), but not
detected in the macrophage (F4/80, green) lining layer. Scale bars, 25 µm.
h, Flow-cytometric analysis of the percentage of partner-derived blood
monocytes and synovial macrophages of DsRed/wild-type parabionts
after 9 weeks of parabiosis. Mean ± s.e.m. Blood, n = 8; synovial joint,
n = 8. i, Chimerism ratio of blood monocytes and synovial macrophages
in DsRed/wild-type parabionts after six weeks and nine weeks of
parabiosis, calculated as the quotient of content of partner-derived
tissue macrophages to partner-derived blood monocytes. A chimerism
ratio of one represents the chimerism observed in blood monocytes.
Mean ± s.e.m. Monocytes 6 weeks, n = 6; synovial macrophages 6 weeks,
n = 6; monocytes 9 weeks, n = 4; synovial macrophages 9 weeks, n = 8.
j, k, Flow-cytometric analysis of parabiotic hind paws of DsRed/wild-type
parabionts at the indicated time points of K/BxN serum transfer arthritis.
Data presented show the percentage of partner-derived PMNs (k) and
monocytes/macrophages (1) within the blood circulation and the synovial
tissue and are used to calculate the individual chimerism of tissue- and
blood-derived cells. Mean ± s.e.m. For j, blood day 0, n = 6; synovial tissue
day 0, n = 6; blood day 5, n = 8; synovial tissue day 5, n = 7. For k, blood
day 0, n = 6; synovial tissue day 0, n = 5; blood day 5, n = 7; synovial
tissue day 5, n = 8.
Extended Data Fig. 3 | Fate mapping of synovial lining macrophages
during arthritis. a, Gating strategy for CD45+CD11b+Ly6G−CD115+
classical Ly6Chi and non-classical Ly6Clow monocytes of
Cx3cr1creERR26-tdTomato mice. b, Gating strategy for DAPI− living,
CD45+CD11b+Ly6G−F4/80+ macrophages of Cx3cr1creERR26-tdTomato
mice. c, Evaluation of tdTomato expression in blood monocytes
and synovial macrophages two days and four weeks after tamoxifen
pulse. Mean ± s.e.m.; n = 4 per group. d, e, BFM images of knee joint
synovial membranes of Cx3cr1creERtdTomato mice four weeks after
systemic tamoxifen pulse (d) and five days after local injection of
(Z)-4-hydroxytamoxifen (e) at day 0 and day 7 after the induction of
K/BxN STA showing selective tdTomato (red) expression in synovial
lining macrophages. The smaller graphs to the right in e show the
absence of tdTomato expression in blood monocytes after local (Z)-
4-hydroxytamoxifen injection. Scale bars, 25 µm. f, Gating strategy
for DAPI− living, CD45+CD11b+Ly6G−F4/80+ macrophages of
Cx3cr1creERR26-tdTomato mice four weeks after tamoxifen pulse, used
to calculate the absolute numbers of tdTomato+ macrophages during
steady state and K/BxN STA. g, Gating strategy after EdU labelling
of proliferating macrophages (CD45+CD11b+Ly6G−F4/80+) of
Cx3cr1creERR26-tdTomato mice. h, Quantification of total tdTomato+
and tdTomato− macrophages in paws of Cx3cr1creERR26-tdTomato mice
4 weeks after tamoxifen pulse at day 0, 2 and 5 after induction of STA.
Mean ± s.e.m. Day 0, n = 6; day 2, n = 5; day 5, n = 6.
Extended Data Fig. 4 | Transcriptional profiling of steady-state synovial 
macrophage subsets. a, Sorting strategy for bulk RNA sequencing 
analysis of synovial macrophages of Cx3cr1GFP mice. Macrophages were 
defined as CD45+, Ly6G−, CD11b+ and F4/80+. GFP discriminated GFP+ 
lining macrophages and GFP− interstitial macrophages. b, Hierarchical 
clustering of z-score (left) and log2 counts (right) of selected genes of 
sorted GFP+ lining macrophages, GFP− interstitial macrophages and 
BMDMs generated from bulk RNA sequencing. c, Differential gene 
expression (mean fold change, log2(differentially expressed genes); n = 3 
per group) of tight-junction-associated genes comparing CX3CR1+ lining 
macrophages and BMDMs. Differential expression was performed with 
DESeq2. A Wald test was used to calculate two-sided P values; adjustment 
for multiple comparisons was performed with the Benjamini-Hochberg 
method. *P < 0.05. d, Sorting strategy for synovial macrophages of 
Cx3cr1GFP mice for confirmatory quantitative analysis by PCR with reverse 
transcription (RT-PCR). Macrophages were defined as CD45+, Ly6G−, 
CD11b+ and F4/80+. GFP discriminated GFP+ lining macrophages and 
GFP− interstitial macrophages. A dump channel using anti-CD31 and 
anti-E-cadherin was integrated to avoid endothelial cell or epithelial cell 
contaminations. e, Confirmatory quantitative RT-PCR analysis in synovial 
macrophage subsets determining expression of mRNAs encoding TJP1 
(BMDM, n = 3; lining macrophage, n = 2), claudin 5 (n = 3 per group) 
and claudin 10 (n = 3 per group) in sorted GFP+ lining macrophages 
and in vitro cultured BMDMs, mean ± s.e.m.; two-tailed Student’s t-test, 
*P = 0.012. f, t-SNE profile of sorted synovial CD45+CD11b+Ly6G− 
mononuclear phagocytes of Cx3cr1creER R26-tdTomato mice analysed 
four weeks after tamoxifen pulse during steady-state conditions (top). 
After clustering, cell-cycle phase scoring based on canonical markers 
and regression was performed to determine clustering independent of 
cell-cycle phase (middle and bottom). n = 7,362 cells. g, Gene ontology 
enrichment analysis of biological processes in cells of the proliferating 
Stmn1+ cluster of sorted CD45+CD11b+Ly6G− mononuclear phagocytes 
of a healthy tamoxifen-pulsed Cx3cr1creER R26-tdTomato mouse. The 
top 51 cluster marker genes determined with Seurat were used to 
perform a PANTHER overrepresentation test. The list of markers for 
the Stmn1+ cluster was compared to the reference list using Fisher’s 
exact test with false discovery rate correction. h, t-SNE profile of sorted 
synovial CD45+CD11b+Ly6G− mononuclear phagocytes of a healthy 
tamoxifen-pulsed Cx3cr1creER R26-tdTomato mouse after excluding Acp5+ 
osteoclast precursors revealing four remaining clusters (left). Single-cell 
trajectory analysis integrating cluster information (middle) and 
pseudotime (right) show a branch point of cellular differentiation into 
lining macrophages (red) or interstitial Retnla+ macrophages (dark blue) 
starting from proliferating MHCII+ macrophages (light blue). n = 7,028 
cells. i, Differential gene expression analysis as a function of pseudotime 
in a branch-dependent manner showing a common gene signature of 
a pre-branch precursor cell population choosing two main cell fates: 
either Cx3cr1+ lining macrophage or interstitial Retnla+ macrophage. 
j, Gene expression changes of selected marker genes as a function of 
pseudotime reflecting the cellular differentiation into Retnla+ interstitial 
macrophages (solid line) and Cx3cr1+ lining macrophages (dashed line). 
n = 7,028 cells. k, BFM images of knee joints of Csf1rcreER R26-tdTomato 
mice (tdTomato, red) determining tdTomato expression in CD68+ 
(green) lining macrophages, MHCII+ interstitial macrophages (MHCII, 
white; top) and RELM-α+ interstitial macrophages (RELM-α, white; 
bottom) at indicated times after the start of tamoxifen treatment. Scale 
bars, 50 μm. l, m, Quantification of relative changes in tdTomato+ cells 
among CD68+ lining macrophages, RELM-α+ interstitial macrophages 
and MHCII+ interstitial macrophages in Csf1rcreER R26-tdTomato mice 
at indicated times after the start of tamoxifen treatment. n = 3 mice per 
group. Data are mean ± s.e.m. n, tdTomato (red) expression in CD68 
(green) macrophages in synovial tissue of the knee joint of Retnlacre R26-tdTomato 
mice. Scale bars, 250 μm (left), 25 μm (right). o, p, BFM images 
(o) and quantification of changes (p) in CD68+ (red) lining macrophages 
and MHCII+ (white) interstitial macrophages in LysMcre CD115DTR 
mice after 10 days of DT treatment, at the indicated time points after the 
beginning of DT treatment. Scale bars, 50 μm. n = 3 technical replicates. 
Data are mean ± s.e.m. q, Scheme of the postulated dynamic continuum of 
differentiating tissue-resident macrophages within the synovial tissue. 
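
The overrepresentation test named in panel g can be illustrated with a one-sided Fisher's exact test for a single gene-ontology term. The sketch below is not PANTHER's implementation; it only shows the underlying hypergeometric tail probability. The function name `fisher_overrepresentation_p` and all counts in the usage lines are hypothetical, apart from the list size of 51 marker genes taken from the legend.

```python
from math import comb


def fisher_overrepresentation_p(hits_in_list, list_size, hits_in_reference, reference_size):
    """One-sided Fisher's exact test for overrepresentation of one term.

    Returns P(X >= hits_in_list), where X is hypergeometric: list_size genes
    are drawn from reference_size genes, of which hits_in_reference carry
    the annotation of interest.
    """
    upper = min(list_size, hits_in_reference)
    total = comb(reference_size, list_size)
    tail = sum(
        comb(hits_in_reference, i)
        * comb(reference_size - hits_in_reference, list_size - i)
        for i in range(hits_in_list, upper + 1)
    )
    return tail / total


# Hypothetical counts: 51 marker genes, 12 of them annotated to a term that
# covers 400 of 20,000 reference genes (only the 51 comes from the legend).
p_value = fisher_overrepresentation_p(12, 51, 400, 20_000)
print(f"one-sided P = {p_value:.3e}")
```

In practice one such P value is obtained per term, and the set of P values is then corrected for the false discovery rate, as the legend states.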
Extended Data Fig. 5 | Transcriptional profiling of mononuclear 
phagocytes during arthritis. a, t-SNE scRNA-seq profiles of 
sorted synovial CD45+CD11b+Ly6G− mononuclear phagocytes of 
Cx3cr1creER R26-tdTomato mice analysed four weeks after tamoxifen 
pulse at the indicated time points after the induction of K/BxN STA, 
coloured by cluster assignment and annotated post hoc. Day 1, n = 4,640 
cells; day 2, n = 2,722 cells; day 5, n = 3,237 cells. b, scRNA-seq-derived 
expression patterns of indicated genes within synovial mononuclear 
phagocytes at indicated time points after the induction of STA. Day 1, 
n = 4,640 cells; day 2, n = 2,722 cells; day 5, n = 3,237 cells. c, t-SNE plots 
of sorted CD45+CD11b+Ly6G− cells from arthritic hind paws at day 1 
and day 5 after K/BxN serum transfer, showing the expression of Cx3cr1, 
Axl and Mfge8 within the cluster of lining macrophages. d, Comparison 
of available scRNA-seq datasets from monocytes of human synovial 
tissue derived from patients suffering from rheumatoid arthritis and 
osteoarthritis* with scRNA-seq profiles of mouse CD45+CD11b+Ly6G− 
cells on day 5 after the induction of STA. Values represent the quotient of 
the numbers of all co-expressed marker genes of the 5 macrophage clusters 
at day 5 to the top 20 provided human marker genes of the 4 described 
subpopulations of human monocytes SC-M1, SC-M2, SC-M3 and SC-M4. 
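
The quotient described in panel d can be read as a set overlap: the number of marker genes shared between a mouse cluster and a human subpopulation, divided by the size of the human marker list. The following is a minimal sketch under that reading. The function name `overlap_quotient`, the case-insensitive symbol matching (a crude stand-in for a proper orthology mapping) and the gene lists are assumptions for illustration, not data or code from the study.

```python
def overlap_quotient(mouse_markers, human_markers):
    """Fraction of human marker genes that also appear among the mouse markers.

    Symbols are compared case-insensitively, which only approximates a real
    mouse-to-human orthology mapping (assumption).
    """
    human = {gene.upper() for gene in human_markers}
    if not human:
        raise ValueError("human marker list is empty")
    mouse = {gene.upper() for gene in mouse_markers}
    return len(mouse & human) / len(human)


# Illustrative placeholder lists, not the marker lists used in the figure.
mouse_cluster = ["Cx3cr1", "Trem2", "Axl", "Mfge8", "Vsig4"]
human_subset = ["TREM2", "AXL", "MERTK", "VSIG4"]

print(overlap_quotient(mouse_cluster, human_subset))  # 3 of 4 shared -> 0.75
```

With the top 20 human markers per subpopulation, as in the legend, the denominator would be 20 for every comparison.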
Extended Data Fig. 6 | Expression patterns of tight-junction proteins 
and gap-junction proteins in tdTomato+ lining macrophages. 
Expression of claudin 5, TJP1/ZO-1 and claudin 13 as well as that of 
connexin 43 (grey, filled arrowheads) in synovial lining tdTomato+ (red) 
macrophages of Cx3cr1cre R26-tdTomato mice during steady state and on 
days 1, 2 and 7 after the induction of K/BxN STA. Scale bars, 5 μm. 


Extended Data Fig. 7 | Ultrastructural characterization of cell-cell 
contacts between lining macrophages. a, Representative CLSM of 
macrophages (tdTomato, red) within the synovial membrane of knee 
joints of Cx3cr1"®R26-tdTomato mice, visualizing the tight-junction 
protein ZO-1/TJP1 (white). Phalloidin, green; DAPI, blue. Scale bars, 
5 μm. b, Transmission electron microscopy (TEM) images of the 
synovial membrane of a healthy knee joint showing tight junctions 
(tj), adherens junctions (aj), desmosomes (ds) and interdigitations 
connecting synovial lining macrophages. c, TEM micrograph showing 
synovial lining macrophages (red) constituting a dense physical barrier 
segregating the synovial fluid from sublining interstitial tissue containing 
synovial fibroblasts (cyan) and endothelial cells (purple) embedded in the 
extracellular matrix (beige). d, TEM micrograph demonstrating synovial 
macrophages (red) forming the uppermost cell layer covering the layer of 
synovial fibroblasts (cyan). e, f, TEM micrographs of an inflamed synovial 
membrane two days after induction of K/BxN STA, showing the disruption 
of the covering synovial macrophage (red) layer and a reorientation of 
synovial macrophages (red) and synovial fibroblasts (cyan) directed to the 
synovial cavity. g, A TEM micrograph of an inflamed synovial membrane 
two days after the induction of STA reveals the emergence of macrophages 
containing large amounts of vacuoles filled with phagocytosed material. 
h, i, Recruited monocytes and granulocytes as well as free DNA of 
neutrophil extracellular traps (blue) within the synovial cavity of knee 
joints two days after the induction of STA. Filled arrowheads point at an 
exemplary monocyte engulfing free DNA. 
Extended Data Fig. 8 | Comparison of mouse and human synovial 
lining macrophages. a, Histological sections of healthy (STA day 0, 
left) and inflamed (STA day 7, right) mouse knee joints of Cx3cr1cre R26-tdTomato 
mice, showing the expression of TREM2 (green; filled 
arrowheads) in lining macrophages (tdTomato, red). Scale bars, 100 μm 
(top), 10 μm (bottom). b, c, Histological sections of synovial tissue of 
human knee joints isolated from patients diagnosed with osteoarthritis 
(OA) and rheumatoid arthritis (RA) determining expression of TREM2 
(green; filled arrowheads) (b) and TJP1 (green; filled arrowheads) 
(c) in synovial macrophages (CD68, red). Scale bars, 100 μm (top), 
10 μm (bottom). d, Flow-cytometric analysis of the composition and 
frequencies of MHCII+TREM2− and MHCII−TREM2+ mononuclear 
phagocytes in synovial tissue samples isolated from human knee joints 
of patients diagnosed with osteoarthritis and rheumatoid arthritis. 
e, Histology-based quantification of the density of the synovial 
macrophage lining (defined as percentage of CD68+TREM2+ 
macrophages among total lining cells) in synovial tissue sections of 
patients diagnosed with osteoarthritis (n = 4) and rheumatoid arthritis 
(n = 5), respectively. Data are mean ± s.e.m., two-tailed Student's t-test. 


Extended Data Fig. 9 | Role of CX3CR1+ macrophages during arthritis. 
a, To quantify lining density, tdTomato+ macrophages (red) were manually 
isolated from 3D reconstructions of optically cleared and LSFM-imaged 
Cx3cr1cre R26-tdTomato knee joints (autofluorescence, grey; CD31, blue) 
using Imaris software. Isolated surfaces (yellow) were volume-rendered 
for tdTomato+ macrophages (red) and whole-area volume (green). 
Lining density was calculated from the ratio of whole-area volume to 
macrophage volume. An exemplary image of the same knee joint before 
and after isolation of lining macrophages is shown. Scale bars, 200 μm. 
b, Dynamic-contrast-enhanced magnetic resonance imaging (DCE-MRI) 
data analysis. The red line drawn in the sagittal T1-weighted image after 
administration of contrast agent marks the transverse plane used for 
T1-weighted DCE-MRI analysis. The DCE curve generated from the region of 
interest (synovial tissue) was normalized to the measurement time point 
after complete injection of contrast agent, and the time of measurements 
was converted to distinctive measurements. c, CLSM images of knee joints 
of Cx3cr1GFP mice injected with protein-G-purified and Alexa-Fluor-647-labelled 
K/BxN serum IgG (grey) at the indicated time points after IgG 
injection, determining the uptake of labelled IgG by macrophages (GFP, 
green; CD68, red) in the synovial tissue and the synovial lining (synovial 
cavity, sc). Scale bars, 10 μm. d, CLSM scan with higher magnification 
showing localization of labelled IgG (grey) inside the vacuoles of CD68+ 
(red) lining and interstitial synovial macrophages 24 h after injection. 
Scale bars, 10 μm. e, Clinical course of K/BxN STA in wild-type mice 
that were treated with an anti-GR1 antibody to deplete PMNs and 
inflammatory Ly6Chigh monocytes or a control antibody (LTF-2), one 
day before induction of STA. Mean ± s.e.m.; n = 5 per group. 

f, Histological CLSM analysis of lining morphology after anti-GR1 
antibody-mediated neutrophil/monocyte depletion one day after 
induction of STA. Lining macrophages (tdTomato, red). Scale bars, 20 μm. 
g, Flow cytometry analysis of synovial macrophages and blood Ly6Chigh or 
Ly6Clow monocytes of Cx3cr1cre iDTR mice and iDTR control mice one 
day and five days after two injections of DT (500 ng per mouse per day, 
i.p.). Mean ± s.e.m. For day 1: iDTR, n = 5; Cx3cr1cre iDTR, n = 7; for day 
5, n = 3 per group. Two-tailed Student’s t-test, ***P < 0.0001. 

h, Representative BFM images of the infiltration of PMNs (Ly6G, green) 
and neutrophil extracellular trap formation (filled arrowheads, DAPI, 
blue) within the synovial cavity of knee joints of Cx3cr1cre iDTR (n = 3) 
and iDTR control (n = 3) mice 6 days after injection of DT and 24 h after 
induction of STA (CD68, red). Scale bars, 200 μm and for magnified view, 
50 μm. i, Treatment scheme and clinical course of STA in Cx3cr1cre iDTR 
and iDTR control mice that had received a unilateral local injection of DT 
(n = 7) and PBS (n = 7), respectively. P values calculated using two-tailed 
paired t-test, **P = 0.008. j, Clinical course of STA including AUC of 
the corresponding clinical index in C57BL/6 wild-type mice treated with 
C5C2 claudin peptidomimetics (3.5 μmol kg−1, i.v., n = 8) or scrambled 
C5C2 control peptide (C5C2scr; 3.5 μmol kg−1, i.v., n = 6) one day before 
and after the induction of STA. Data are mean ± s.e.m. Mann-Whitney 
U-test for clinical index with *P < 0.05, and two-tailed Student’s t-test for 
AUC with **P = 0.0062. k, Normalized signal intensity curves of DCE-MRI 
of synovial tissue of knee joints over 90 measurements with intervals 
of 7 s at the indicated days after STA in C57BL/6 wild-type mice treated 
with C5C2 claudin peptidomimetics (3.5 μmol kg−1, i.v.) or vehicle one 
day before the induction of STA. Data are mean ± s.e.m. Day 0: vehicle, 
n = 10 knee joints; C5C2, n = 10 knee joints; day 1: vehicle, n = 9 knee 
joints; C5C2, n = 10 knee joints. P values for AUC were calculated using 
two-tailed Student's t-test, *P = 0.0256. l, Flow cytometry of blood 
monocytes and neutrophils of LysMcre CD115DTR mice (n = 6) and 
CD115DTR control mice (n = 5) one day after two injections of DT 
(500 ng per mouse per day, i.p.). Mean ± s.e.m.; two-tailed Student's t-test, 
*P = 0.0467, **P < 0.0069, ***P = 0.0001. 
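
The DCE-MRI processing in panels b and k (normalization to the first measurement after complete contrast-agent injection, then an area under the curve over measurement index) can be sketched as follows. The legend does not say whether normalization divides by or subtracts the reference value; division is assumed here. The function names `normalize_dce` and `trapezoid_auc` and the sample values are hypothetical.

```python
def normalize_dce(signal, reference_index):
    """Scale a DCE curve by its value at the reference measurement.

    reference_index is the first measurement after complete injection of
    contrast agent; earlier measurements are dropped (assumption).
    """
    reference = signal[reference_index]
    if reference == 0:
        raise ValueError("reference signal must be non-zero")
    return [value / reference for value in signal[reference_index:]]


def trapezoid_auc(values, step=1.0):
    """Area under the curve by the trapezoidal rule.

    step=1.0 integrates over measurement index; step=7.0 would integrate
    over seconds for the 7 s intervals given in the legend.
    """
    return sum((a + b) / 2.0 * step for a, b in zip(values, values[1:]))


raw_signal = [10.0, 10.0, 20.0, 30.0, 25.0, 20.0]  # made-up intensities
curve = normalize_dce(raw_signal, reference_index=2)
print(curve)                 # [1.0, 1.5, 1.25, 1.0]
print(trapezoid_auc(curve))  # 3.75
```

Per-joint AUC values computed this way could then be compared between groups with a two-tailed Student's t-test, which is the test the legend reports for AUC.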
Extended Data Fig. 10 | Schematic summary. Top, scheme of the 
postulated origin of resident synovial CX3CR1+ lining macrophages that 
constantly repopulate from proliferating tissue-resident CX3CR1−MHCII+ 
interstitial macrophages. Bottom, tight-junction-forming resident 
lining macrophages form a protective barrier for joint structures that 
disintegrates during arthritis, enabling infiltration of inflammatory 
myeloid cells. 
Statistical parameters 


When statistical analyses are reported, confirm that the following items are present in the relevant location (e.g. figure legend, table legend, main 
text, or Methods section). 


n/a | Confirmed 


The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement 


An indication of whether measurements were taken from distinct samples or whether the same sample was measured repeatedly 


The statistical test(s) used AND whether they are one- or two-sided 
Only common tests should be described solely by name; describe more complex techniques in the Methods section. 


A description of all covariates tested 


A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons 


A full description of the statistics including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND 
variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) 


For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted 
Give P values as exact values whenever suitable. 


For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes 


For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings 


Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated 


Clearly defined error bars 
State explicitly what error bars represent (e.g. SD, SE, CI) 


Our web collection on statistics for biologists may be useful. 


Software and code 


Policy information about availability of computer code 


Data collection Leica TCS SP5 II CLSM data were collected using LAS AF software version 2.7.3.9723. 
Spinning Disc Confocal Microscopy (SDCM) was performed using a Zeiss Spinning Disc Axio Observer.Z1 and images were collected via 
the Zen Blue image acquisition software Version 2.3. 
Light sheet fluorescence microscopy (LSFM) data were generated using an UltraMicroscope II (LaVision BioTec GmbH) and collected with 
ImSpector software Version 5.1.304. 

MRI scans were performed using a ClinScan 70/30 7 Tesla MRI System (Bruker). 
Flow cytometry data and cell sorting data were acquired using a CytoFLEX S (Beckman Coulter) and a MoFlo XDP (Beckman Coulter) 
system. 


Individual settings for data acquisition via the systems listed above are described in detail in the experimental procedures. 


Data analysis CLSM and SDCM data were processed and analysed using Huygens Professional software Version 17.10 (Scientific Volume Imaging), 
Imaris software Version 9.1 (Bitplane) and ImageJ software Version 1.8.0_112. 

LSFM data were processed using Imaris software Version 9.1 (Bitplane) and ImageJ software Version 1.8.0_112. MRI data were 
processed via Horos LGPL Version 3.0. 

Flow cytometry data and cell sorting data were analyzed via the Summit Software System, MacsQuantify (Miltenyi Biotec, Version 2.5), 
CytExpert (Beckman Coulter, Version 2.2.0.97), FlowJo (FlowJo, Version 7.6.5), and Kaluza software (Beckman Coulter, Version 1.5a). 
Statistical data analysis was performed with GraphPad Prism 5. 


Differential expression analysis was performed with the DESeq2 package v.1.20.0 and volcano plots were generated with the ggplot2 
package in R software. 

For single-cell RNA sequencing analysis, CellRanger 2.1.0, FastQC v0.11.7, the Seurat (Version 2.3) package for R and the Monocle 2 package 
were used. 


All data analysis strategies and software are described precisely in the experimental procedures and supplementary table 3. 


For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers 
upon request. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information. 


Data 


Policy information about availability of data 
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: 


- Accession codes, unique identifiers, or web links for publicly available datasets 
- A list of figures that have associated raw data 
- A description of any restrictions on data availability 


The data that support the findings of this study are available on reasonable request from the corresponding authors [G.K., S.C., A.G.]. The data are not publicly 
available because they contain information that could compromise research participant privacy. 
Life sciences study design 


All studies must disclose on these points even when the disclosure is negative. 


Sample size Sample size was determined by statistical power analysis including high significance levels (p < 0.05). For calculation of statistical significance 
GraphPad Prism 5 was used. Data are presented as mean ± SEM and were analyzed using Student’s t-test, Mann-Whitney U-test, or Kruskal-Wallis 
H-test with Dunn’s multiple comparisons test as post hoc procedure. P values less than 0.05 were considered significant. 
Differential expression analysis for bulk RNA sequencing was performed with DESeq2. A Wald test was used to calculate two-sided 
p-values; adjustment for multiple comparisons was performed with the Benjamini-Hochberg method. For the PANTHER overrepresentation test, 
Fisher’s exact test with false discovery rate correction was used. Cluster markers of single-cell sequencing data sets were identified using the Wilcoxon 
rank sum test. Adjusted p-values were based on Bonferroni correction using all genes in the dataset. 
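
The two multiple-comparison adjustments named above can be written out in a few lines. This is a generic sketch of the standard Benjamini-Hochberg and Bonferroni procedures, not the DESeq2 or Seurat code used in the study. The function names and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def bonferroni(p_values):
    """Bonferroni adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


raw = [0.01, 0.04, 0.03, 0.005]  # made-up p-values
print(benjamini_hochberg(raw))   # approximately [0.02, 0.04, 0.04, 0.02]
print(bonferroni(raw))           # approximately [0.04, 0.16, 0.12, 0.02]
```

Bonferroni controls the family-wise error rate and is the stricter of the two; Benjamini-Hochberg controls the false discovery rate.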


Data exclusions No data were excluded from analysis. 
Replication A minimum of three independent measurements per experiment were performed, and the results were successfully confirmed. 
Randomization Experimental groups were randomly allocated. Treated groups were housed together with control groups. 


Blinding Data analysis was performed in a blinded fashion. The results were confirmed by two investigators, who analyzed the blinded data 
independently. 
Materials & experimental systems 

n/a | Involved in the study 

Unique biological materials 
Antibodies 
Eukaryotic cell lines 
Palaeontology 
Animals and other organisms 
Human research participants 


Methods 

n/a | Involved in the study 

ChIP-seq 
Flow cytometry 
MRI-based neuroimaging 


Unique biological materials 


Policy information about availability of materials 


Obtaining unique materials 


All unique biological materials (genetically modified mouse strains) are available from standard commercial sources: 


- C57BL/6J, Charles River, # 632 
- C57BL/6Rj, Janvier Labs 
- Tg(Cx3cr1-cre)MW126Gsat/Mmucd, MMRRC, # 036395-UCD 
- B6;129S6-Gt(ROSA)26Sortm9(CAG-tdTomato)Hze/J, The Jackson Laboratory, # 007905 
- STOCK Tg(ACTB-DsRed*MST)1Nagy/J, DsRed.T3, The Jackson Laboratory, # 005441 
- B6.129P2(C)-Cx3cr1tm2.1(cre/ERT2)Jung/J, The Jackson Laboratory, # 020940 
- FVB-Tg(Csf1r-cre/Esr1*)Jwp/J, The Jackson Laboratory, # 019098 
- C57BL/6-Gt(ROSA)26Sortm1(HBEGF)Awai/J, The Jackson Laboratory, # 007900 
- B6;129S6-Gt(ROSA)26Sortm9(CAG-tdTomato)Hze/J, The Jackson Laboratory, # 007905 
- C57BL/6-Tg(Csf1r-HBEGF/mCherry)1Mnz/J, The Jackson Laboratory, # 024046 
- B6.129P-Cx3cr1tm1Litt/J, The Jackson Laboratory, # 005582 


All used mouse strains are additionally listed in supplementary table 1. 


Antibodies 


Antibodies used 


Antibodies used in this study are: 


- ApoE 
unconjugated, Thermo Fisher, Cat: 701241, Lot: 1984882, Clone: 16H22L18, Dilution: 1:200 


- CD1c-PerCP/Cy5.5 
BioLegend, Cat: 331513, Lot: B267879, Clone: L161, Dilution: 1:500 


- CD11b-PE/Cy7 
BioLegend, Cat: 101216, Lot: B185646, Clone: M1/70, Dilution: 1:500 


- CD11b-Alexa Fluor 488 
BioLegend, Cat: 393107, Lot: B261594, Clone: LM2, Dilution: 1:500 


- CD14 PE-Cy7 
BioLegend, Cat: 367111, Lot: B252403, Clone: 63D3, Dilution: 1:500 


- CD15-Brilliant Violet 421 
BioLegend, Cat: 232039, Lot: B263781, Clone: W6D3, Dilution: 1:500 


- CD20-Brilliant Violet 421 
BioLegend, Cat: 302329, Lot: B257594, Clone: 2H7, Dilution: 1:500 


- CD31-PE 
BioLegend, Cat: 102507, Lot: B129965, Clone: MEC13.3, Dilution: 1:500 


- CD31-Alexa Fluor 647 
BioLegend, Cat: 102516, Lot: B234197, Clone: MEC13.3, Dilution: 2.5 μg/mouse 


- CD45-Brilliant Violet 421 
BioLegend, Cat: 103133, Lot: B263588, Clone: 30-F11, Dilution: 1:500 


- CD45-Alexa Fluor 700 
BioLegend, Cat: 368513, Lot: B248833, Clone: 2D1, Dilution: 1:500 


- CD45.2-Alexa Fluor 700 
BioLegend, Cat: 109822, Lot: B202497, Clone: 104, Dilution: 1:500 


- CD68-Alexa Fluor 594 
BioLegend, Cat: 137020, Lot: B239125, Clone: FA-11, Dilution: 1:400 


- CD68-Alexa Fluor 647 
BioLegend, Cat: 137004, Lot: B153907, Clone: FA-11, Dilution: 1:400 


- CD68 unconjugated 
BioLegend, Cat: 333801, Lot: B200949, Clone: Y1/82A, Dilution: 1:200 


- CD68 unconjugated 
Abcam, Cat: ab955, Lot: GR3230929-1, Clone: KP1, Dilution: 1:200 


- CD68-Alexa Fluor 594 
R&D Systems, Cat: IC20401T, Lot: 1471045, Clone: 298807, Dilution: 1:200 


- CSF1R-APC 
Biolegend, Cat: 135510, Lot: B183456, Clone: AFS98, Dilution: 1:500 


- CSF1R-Alexa Fluor 647 
Biolegend, Cat: 135530, Lot: , Clone: AFS98, Dilution: 1:200 


- Claudin 2 unconjugated 
Abcam, Cat: ab53032, Lot: GR314368-11, Clone: polyclonal, Dilution: 1:200 


- Claudin 5 unconjugated 
Abcam, Cat: ab15106, Lot: GR3182385, Clone: polyclonal, Dilution: 1:200 


- Claudin 13 unconjugated 
Invitrogen, Cat: PA1-24420, Lot: TA2507851, Clone: polyclonal, Dilution: 1:200 


- Connexin 43 unconjugated 
Sigma Aldrich, Cat: C6219, Lot: 027144804V, Clone: polyclonal, Dilution: 1:200 


- Donkey anti-Rabbit IgG Alexa Fluor 647 
Life Technologies, Cat: A-31573, Lot: 1563697, Clone: polyclonal, Dilution: 1:200 


- Donkey anti-Rabbit IgG Alexa Fluor 488 
Life Technologies, Cat: A-21206, Lot: 1644644, Clone: polyclonal, Dilution: 1:200 


- E-Cadherin-PE, 
Biolegend, Cat: 147303, Lot: B260705, Clone: DECMA-1, Dilution: 1:500 


- F4/80-Alexa Fluor 647, 
BioLegend, Cat: 123122, Lot: B212680, Clone: BM8, Dilution: 1:400 


- F4/80-FITC, 
BioLegend, Cat: 123108, Lot: B177257, Clone: BM8, Dilution: 1:400 


- HLA-DR PE 
BioLegend, Cat: 361605, Lot: B261328, Clone: TU36, Dilution: 1:500 


- Ki67-Alexa Fluor 647 
BioLegend, Cat: 652407, Lot: B238782, Clone: 16A8, Dilution: 1:200 


- Ly6C-Alexa Fluor 488 
BioLegend, Cat: 128022, Lot: B248739, Clone: HK1.4, Dilution: 1:400 


- Ly6G-Brilliant Violet 
BioLegend, Cat: 127627, Lot: B193096, Clone: 1A8, Dilution: 1:400 


- Ly6G-FITC 
BioLegend, Cat: 127606, Lot: B175677, Clone: 1A8, Dilution: 1:400 


- Ly6G-Alexa Fluor 488 
BioLegend, Cat: 127626, Lot: B240194, Clone: 1A8, Dilution: 1:400 


- Ly6G-Alexa Fluor 647 
BioLegend, Cat: 127610, Lot: B204928, Clone: 1A8, Dilution: 1:200 


- MHC II-PE 
BioLegend, Cat: 107608, Lot: B130064, Clone: M5/114.15.2, Dilution: 1:200 


- Relm alpha 
Abcam, Cat: ab39626, Lot: GR1287151, Clone: polyclonal, Dilution: 1:200 


- Trem 2 unconjugated 
Abcam, Cat: ab86491, Lot: GR3207091-11, Clone: RM0139-5J46, Dilution: 1:200 


- Trem2-APC 
R&D Systems, Cat: FAB17291A, Lot: AADSO17111, Clone: 237920, Dilution: 1:500 


- ZO-1 unconjugated 
EMD Millipore, Cat: AB2272, Lot: 2905383, Clone: polyclonal, Dilution: 1:100 


All used antibodies are listed in more detail in supplementary table 2, including information regarding antigen, conjugation, 
concentration, isotype, host reactivity, clone, source, Cat#, Lot#, dilution, and application. 


Validation 


Exclusively commercially available antibodies were used. Antibody specificity, concentration and quality validation were 
performed by the manufacturers. Validation statements of the manufacturers can be found on their webpages: 


- Abcam: https://www.abcam.com/primary-antibodies/improving-reproducibility-with-better-antibodies 

- BioLegend: https://www.biolegend.com/reproducibility 

- Invitrogen: https://www.thermofisher.com/de/de/home/life-science/antibodies/invitrogen-antibody-validation.html 

- LifeTechnologies: https://www.thermofisher.com/de/de/home/life-science/antibodies/invitrogen-antibody-validation/independent-antibody-validation.html 

- R&D Systems: https://www.rndsystems.com/tags/antibody-validation 

- Sigma-Aldrich: https://www.sigmaaldrich.com/technical-documents/articles/biology/antibody-standard-validation.html 


- ApoE, unconjugated, Thermo Fisher, Cat: 701241 
https://assets.thermofisher.com/TFS-Assets/LSG/certificate/Certificates-of-Analysis/701241_1984882.PDF 


- CD1c-PerCP/Cy5.5, BioLegend, Cat: 331513 
https://www.biolegend.com/en-us/global-elements/pdf-popup/percp-cyanine5-5-anti-human-cd1c-antibody-5182?filename=PerCPCyanine55%20anti-human%20CD1c%20Antibody.pdf&pdfgen=true 


- CD11b-PE/Cy7, BioLegend, Cat: 101216 
https://www.biolegend.com/en-us/global-elements/pdf-popup/pe-cy7-anti-mouse-human-cd11b-antibody-1921?filename=PECy7%20anti-mousehuman%20CD11b%20Antibody.pdf&pdfgen=true 


- CD11b-Alexa Fluor 488, BioLegend, Cat: 393107 
https://www.biolegend.com/en-us/global-elements/pdf-popup/alexa-fluor-488-anti-human-cd11b-antibody-16010?filename=Alexa%20Fluor%20488%20anti-human%20CD11b%20Antibody.pdf&pdfgen=true 


- CD14 PE-Cy7, BioLegend, Cat: 367111 
https://www.biolegend.com/en-us/global-elements/pdf-popup/pe-cy7-anti-human-cd14-antibody-12794?filename=PECy7%20anti-human%20CD14%20Antibody.pdf&pdfgen=true 


- CD15-Brilliant Violet 421, BioLegend, Cat: 232039 (wrong? 323039) 
https://www.biolegend.com/en-us/global-elements/pdf-popup/brilliant-violet-421-anti-human-cd15-ssea-1-antibody-12371?filename=Brilliant%20Violet%20421%20anti-human%20CD15%20SSEA-1%20Antibody.pdf&pdfgen=true 


- CD20-Brilliant Violet 421, BioLegend, Cat: 302329 
https://www.biolegend.com/en-us/global-elements/pdf-popup/brilliant-violet-421-anti-human-cd20-antibody-7192?filename=Brilliant%20Violet%20421%20anti-human%20CD20%20Antibody.pdf&pdfgen=true 


- CD31-PE, BioLegend, Cat: 102507 
https://www.biolegend.com/en-us/global-elements/pdf-popup/pe-anti-mouse-cd31-antibody-379?filename=PE%20anti-mouse%20CD31%20Antibody.pdf&pdfgen=true 


- CD31-Alexa Fluor 647, BioLegend, Cat: 102516 
https://www.biolegend.com/en-us/global-elements/pdf-popup/alexa-fluor-647-anti-mouse-cd31-antibody-3094?filename=Alexa%20Fluor%20647%20anti-mouse%20CD31%20Antibody.pdf&pdfgen=true 


- CD45-BrilliantViolet 421, Biolegend, Cat:103133 
https://www.biolegend.com/en-us/global-elements/pdf-popup/brilliant-violet-421-anti-mouse-cd45-antibody-7253? 
filename=Brilliant%20Violet%20421%20anti-mouse%20CD45%20Antibody.pdf&pdfgen=true 


- CD45-Alexa Fluor 700, Biolegend, Cat: 368513 
https://www.biolegend.com/en-us/global-elements/pdf-popup/alexa-fluor-700-anti-human-cd45-antibody-12399? 
filename=Alexa%20Fluor%20700%20anti-human%20CD45%20Antibody.pdf&pdfgen=true 


- CD45.2-Alexa Flour 700, BioLegend, Cat: 109822 
https://www.biolegend.com/en-us/global-elements/pdf-popup/alexa-fluor-700-anti-mouse-cd45-2-antibody-3393? 
filename=Alexa%20Fluor%20700%20anti-mouse%20CD452%20Antibody.pdf&pdfgen=true 

- CD68-Alexa Fluor 594, BioLegend, Cat: 137020 
https://www.biolegend.com/en-us/global-elements/pdf-popup/alexa-fluor-594-anti-mouse-cd68-antibody-9671 ?filename=Alexa 
%20Fluor%20594%20anti-mouse%20CD68%20Antibody.pdf&pdfgen=true 
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- CD68-Alexa Fluor 647, BioLegend, Cat: 137004 
https://www.biolegend.com/en-us/global-elements/pdf-popup/alexa-fluor-647-anti-mouse-cd68-antibody-6422 ?filename=Alexa 
%20Fluor%20647%20anti-mouse%20CD68%20Antibody.pdf&pdfgen=true 

- CD68 unconjugated, BioLegend, Cat: 333801 
https://www.biolegend.com/en-us/global-elements/pdf-popup/purified-anti-human-cd68-antibody-4835 ?filename=Purified% 
20anti-human%20CD68%20Antibody.pdf&pdfgen=true 


- CD68 unconjugated, Abcam, Cat: ab955 
https://www.abcam.com/cd68-antibody-kp1-ab955.html?productWallTab=Questions 


- CD68-Alexa Fluor 594, R&D Systems, Cat: IC20401T https://resources.rndsystems.com/pdfs/datasheets/ic20401t.pdf 
- CD115-APC, BioLegend, Cat: 135510 
- Claudin 2 unconjugated, Abcam, Cat: ab53032 https://www.abcam.com/claudin-2-antibody-ab53032.html 


- Claudin 5 unconjugated, Abcam, Cat: ab15106 https://www.abcam.com/claudin-5-antibody-ab15106.html 


- Claudin 13 unconjugated, Invitrogen, Cat: PA1-24420 
https://www.thermofisher.com/document-connect/document-connect.html?url=https%3A%2F%2Fassets.thermofisher.com% 
2FTFS-Assets%2FLSG%2Fcertificate%2FCertificates-of-Analysis% 
2FMA191114_TA2507851.PDF&title=TG9zLUS5yLiZuYnNwO1RBMjUWNzg1MQ== 


- Connexin 43 unconjugated, Sigma Aldrich, Cat: C6219 https://www.sigmaaldrich.com/content/dam/sigma-aldrich/docs/Sigma/ 
Datasheet/3/c6219dat.pdf 


- CSF1R-APC, Biolegend, Cat: 135510 
https://www.biolegend.com/en-us/global-elements/pdf-popup/apc-anti-mouse-cd115-csf-1r-antibody-6336 ?filename=APC% 
20anti-mouse%20CD115%20CSF-1R%20Antibody.pdf&pdfgen=true 


- CSF1R-Alexa Fluor 647, Biolegend, Cat: 135530 
https://www.biolegend.com/en-us/global-elements/pdf-popup/alexa-fluor-647-anti-mouse-cd115-csf-1r-antibody-12485? 
filename=Alexa%20Fluor%20647%20anti-mouse%20CD115%20CSF-1R%20Antibody.pdf&pdfgen=true 


- Donkey anti-Rabbit IgG Alexa Fluor 647, Life Technologies, Cat: A-31573 
https://assets.thermofisher.com/TFS-Assets/LSG/certificate/Certificates%20of%20Analysis/1563697_A31573.pdf 


- Donkey anti-Rabbit IgG Alexa Fluor 488, Life Technologies, Cat: A-21206 
https://www.thermofisher.com/document-connect/document-connect.html?url=https%3A%2F%2Fassets.thermofisher.com% 
2FTFS-Assets%2FLSG%2Fcertificate%2FCertificates-of-Analysis% 
2F1644644_A21202.pdf&title=TG9ZLU5yLiZuUYnNwOzE2NDQ2NDQ= 


- E-Cadherin-PE, BioLegend, Cat: 147303 
https://www.biolegend.com/en-us/global-elements/pdf-popup/pe-anti-mouse-human-cd324-e-cadherin-antibody-9276? 
filename=PE%20anti-mousehuman%20CD324%20E-Cadherin%20Antibody.pdf&pdfgen=true 


- F4/80-Alexa Fluor 647, BioLegend, Cat: 123122 
https://www.biolegend.com/en-us/global-elements/pdf-popup/alexa-fluor-647-anti-mouse-f4-80-antibody-4074? 
filename=Alexa%20Fluor%20647%20anti-mouse%20F480%20Antibody.pdf&pdfgen=true 


- F4/80-FITC, BioLegend, Cat: 123108 
https://www.biolegend.com/en-us/global-elements/pdf-popup/fitc-anti-mouse-f4-80-antibody-4067?filename=FITC%20anti- 
mouse%20F480%20Antibody.pdf&pdfgen=true 


- HLA-DR PE, BioLegend, Cat: 361605 
https://www.biolegend.com/en-us/global-elements/pdf-popup/pe-anti-human-hla-dr-antibody-9390?filename=PE%20anti- 
human%20HLA-DR%20Antibody.pdf&pdfgen=true 


- Ki67-AF647, BioLegend, Cat: 652407 https://www.biolegend.com/Default.aspx?ld=18921 


- Ly6C-Alexa Fluor 488, BioLegend, Cat: 128022 
https://www.biolegend.com/en-us/global-elements/pdf-popup/alexa-fluor-488-anti-mouse-ly-6c-antibody-6756 ?filename=Alexa 
%20Fluor%20488%20anti-mouse%20Ly-6C%20Antibody.pdf&pdfgen=true 

- Ly6G-Brilliant Violet 421, BioLegend, Cat: 127627 
https://www.biolegend.com/en-us/global-elements/pdf-popup/brilliant-violet-421-anti-mouse-ly-6g-antibody-7161? 
filename=Brilliant%20Violet%20421%20anti-mouse%20Ly-6G%20Antibody.pdf&pdfgen=true 


- Ly6G-FITC, BioLegend, Cat: 127606 
https://www.biolegend.com/en-us/global-elements/pdf-popup/fitc-anti-mouse-ly-6g-antibody-4775 ?filename=FITC%20anti- 
mouse%20Ly-6G%20Antibody.pdf&pdfgen=true 


- Ly6G-Alexa Fluor 488, BioLegend, Cat: 127626 
https://www.biolegend.com/en-us/global-elements/pdf-popup/alexa-fluor-488-anti-mouse-ly-6g-antibody-7085 ?filename=Alexa 
%20Fluor%20488%20anti-mouse%20Ly-6G%20Antibody.pdf&pdfgen=true 


- Ly6G-Alexa Fluor 647, BioLegend, Cat: 127610 
https://www.biolegend.com/en-us/global-elements/pdf-popup/alexa-fluor-647-anti-mouse-ly-6g-antibody-4780?filename=Alexa 
%20Fluor%20647%20anti-mouse%20Ly-6G%20Antibody.pdf&pdfgen=true 


- MHCII-PE, Biolegend, Cat: 107608 
https://www.biolegend.com/en-us/global-elements/pdf-popup/pe-anti-mouse-i-a-i-e-antibody-367 ?filename=PE%20anti-mouse 
%20I-Al-E%20Antibody.pdf&pdfgen=true 


- Trem 2 unconjugated, Abcam, Cat: ab86491 https://www.abcam.com/trem2-antibody-rm0139-5j46-ab86491.html 


- Trem2-APC, R&D systems, Cat: FAB17291A https://resources.rndsystems.com/pdfs/datasheets/fab17291a.pdf 
- ZO-1 unconjugated, EMD Millipore, Cat: AB2272 


http://www.merckmillipore.com/DE/de/product/Anti-ZO-1-Antibody, MM_NF-AB2272?ReferrerURL=https%3A%2F% 
2Fwww.google.com%2F#documentation 


Animals and other organisms 


Policy information about studies involving animals; ARRIVE guidelines recommended for reporting animal research 


Laboratory animals In this study the following mouse lines were used: 


- C57BL/6 
Strain: C57BL/6J, Source: Charles River, Identifier: 632 


- C57BL/6 
Strain: C57BL/6JRj, Source: Janvier Labs 


- Cx3cr1cre:R26-tdTomato 
Strain: Tg(Cx3cr1-cre)MW126Gsat/Mmucd, Source: MMRRC, Identifier: 036395-UCD 


- DsRed 
Strain: STOCK Tg(ACTB-DsRed*MST)1Nagy/J, DsRed.T3, Source: The Jackson Laboratory, Identifier: 005441 


- Cx3cr1creER 
Strain: B6.129P2(C)-Cx3cr1tm2.1(cre/ERT2)Jung/J, Source: The Jackson Laboratory, Identifier: 020940 


- CSF1RcreER 
Strain: FVB-Tg(Csf1r-cre/Esr1*)1Jwp/J, Source: The Jackson Laboratory, Identifier: 019098 


- iDTR 
Strain: C57BL/6-Gt(ROSA)26Sortm1(HBEGF)Awai/J, Source: The Jackson Laboratory, Identifier: 007900 


- tdTomato 
Strain: B6;129S6-Gt(ROSA)26Sortm9(CAG-tdTomato)Hze/J, Source: The Jackson Laboratory, Identifier: 007905 


- CD115DTR 
Strain: C57BL/6-Tg(Csf1r-HBEGF/mCherry)1Mnz/J, Source: The Jackson Laboratory, Identifier: 024046 


- Cx3cr1gfp 
Strain: B6.129P-Cx3cr1tm1Litt/J, Source: The Jackson Laboratory, Identifier: 005582 


All mice were housed under "specific pathogen-free" (SPF) conditions at the animal facilities of the University of Erlangen, 
Germany. Male and female mice at an age of 8-18 weeks were used. 


Wild animals This study does not include wild animals. 


Field-collected samples This study does not include field-collected samples. 


Human research participants 


Policy information about studies involving human research participants 


Population characteristics Synovial biopsies were obtained from knee joints of patients diagnosed with osteoarthritis (OA) and rheumatoid arthritis (RA), 
respectively. RA patients fulfilled the 2010 EULAR/ACR criteria of RA. All patients were ≥ 18 years of age. 


Recruitment OA patients were recruited at the Department of Trauma Surgery, Universitätsklinikum Erlangen and RA patients were recruited 
at the Department of Internal Medicine 3 - Rheumatology and Immunology, Universitätsklinikum Erlangen. All patients signed an 
informed consent. 


Flow Cytometry 


Plots 


Confirm that: 


The axis labels state the marker and fluorochrome used (e.g. CD4-FITC). 


The axis scales are clearly visible. Include numbers along axes only for bottom left plot of group (a ‘group’ is an analysis of identical markers). 


All plots are contour plots with outliers or pseudocolor plots. 


A numerical value for number of cells or percentage (with statistics) is provided. 


Methodology 

Sample preparation Sample preparation is described in detail in the supplemental experimental procedures. 

Instrument Flow cytometry was performed with a CytoFLEX S, Beckman Coulter. Sorting of cells was performed with a MoFlo XDP, Beckman 
Coulter. 

Software Flow cytometry data and cell sorting data were analyzed via the Summit Software System, CytExpert, FlowJo, and Kaluza 
software. 


Cell population abundance Purity of sorted cells was confirmed by flow cytometry analysis. 


Gating strategy FACS strategies are provided in detail in the extended data and supplementary experimental procedures. 


Tick this box to confirm that a figure exemplifying the gating strategy is provided in the Supplementary Information. 


LETTER 


https://doi.org/10.1038/s41586-019-1472-0 


BORIS promotes chromatin regulatory interactions 
in treatment-resistant cancer cells 


David N. Debruyne, Ruben Dries, Satyaki Sengupta, Davide Seruggia, Yang Gao, Bandana Sharma, 
Hao Huang, Lisa Moreau, Michael McLane, Daniel S. Day, Eugenio Marco, Ting Chen, Nathanael S. Gray, 
Kwok-Kin Wong, Stuart H. Orkin, Guo-Cheng Yuan, Richard A. Young & Rani E. George 


The CCCTC-binding factor (CTCF), which anchors DNA loops that 
organize the genome into structural domains, has a central role in 
gene control by facilitating or constraining interactions between 
genes and their regulatory elements!”. In cancer cells, the disruption 
of CTCF binding at specific loci by somatic mutation** or DNA 
hypermethylation’ results in the loss of loop anchors and consequent 
activation of oncogenes. By contrast, the germ-cell-specific 
paralogue of CTCF, BORIS (brother of the regulator of imprinted 
sites, also known as CTCFL)*®, is overexpressed in several cancers’~* 

but its contributions to the malignant phenotype remain unclear. 
Here we show that aberrant upregulation of BORIS promotes 
chromatin interactions in ALK-mutated, MYCN-amplified 


neuroblastoma” cells that develop resistance to ALK inhibition. 
These cells are reprogrammed to a distinct phenotypic state 
during the acquisition of resistance, a process defined by the initial 
loss of MYCN expression followed by subsequent overexpression 
of BORIS and a concomitant switch in cellular dependence from 
MYCN to BORIS. The resultant BORIS-regulated alterations in 
chromatin looping lead to the formation of super-enhancers that 
drive the ectopic expression of a subset of proneural transcription 
factors that ultimately define the resistance phenotype. These results 
identify a previously unrecognized role of BORIS—to promote 
regulatory chromatin interactions that support specific cancer 
phenotypes. 


Fig. 1 | Targeted therapy resistance in 

Unlike CTCF, which is uniformly expressed in healthy tissues and 
cancer cells, the expression of BORIS is typically restricted to the testis® 
and embryonic stem cells'! (Extended Data Fig. 1a). However, when 
aberrantly expressed in cancer’, it is associated with high-risk features 
that include resistance to treatment (Extended Data Fig. 1b, c). We 
identified BORIS as one of the most differentially expressed genes in 
neuroblastoma cells driven by amplified MYCN” and ALK(F1174L)° 
and rendered resistant to ALK inhibition. Kelly human neuroblastoma 
cells were exposed to increasing concentrations of the ALK inhibitor 
TAE684" until stable resistance was achieved (Fig. 1a, Extended Data 
Fig. 2a-d). The acquisition of stable resistance coincided not only with 
the loss of ALK phosphorylation—which indicates that the cells no 
longer required activation of this receptor tyrosine kinase to maintain 
their oncogenic properties—but also with the absence of other common 
instigators of resistance (Extended Data Fig. 2a, e-h; Supplementary 
Note 1). However, comparison of the gene expression profiles of the 
TAE684-sensitive and resistant cells showed generalized downregula- 
tion of transcription in the resistant cells, but with marked upregulation 
of a subset of transcription factors not typically associated with neuro- 
blastoma cells!*'° (Fig. 1b). 

We therefore proposed that the resistant cells had probably under- 
gone transcriptional reprogramming during the development of 
resistance. To determine the dynamics of resistance development, we 
performed single-cell RNA sequencing (scRNA-seq) analysis on sensi- 
tive, intermediate and fully resistant cell states (Extended Data Fig. 3a). 
Principal component analysis (PCA) indicated a stepwise transition 
as cells progressed from the sensitive to the fully resistant state (Fig. 1c). 
This transition was confirmed by t-distributed stochastic neighbour 
embedding (t-SNE)!’, which clustered the cells into three non- 
overlapping categories (Extended Data Fig. 3b, c). Pseudotime analysis 
based on the transcription factors that were differentially expressed 
throughout the development of resistance revealed that the initial major 
alteration was loss of MYCN expression, which persisted in stably resist- 
ant cells (Fig. 1d, Extended Data Fig. 3d, e). To understand this unex- 
pected result, we analysed the status of MYCN in these cells, and found 
that although genomic amplification was retained, the MYCN locus 
was epigenetically repressed (Extended Data Fig. 3f, g). This state was 
accompanied by a genome-wide reduction of MYCN binding to DNA 
and a consequent revision of associated downstream transcription out- 
comes!>'®1° (Fig. 1e, Extended Data Fig. 3h). Coincident with this loss 
of transcriptional activity, the resistant cells were no longer dependent 
on MYCN for survival, unlike their sensitive controls, which underwent 
apoptosis after depletion of MYCN (Extended Data Fig. 3i). Subsequent 
resistance stages were defined by a gradual increase in the expression 
of the neural developmental markers SOX2 and SOX9”°, followed by 
upregulation of BORIS, ultimately leading to a fully resistant state in 
which BORIS expression was highest and detectable in essentially all 
cells (Fig. 1d, Extended Data Fig. 3j, k). Overexpression of BORIS, 
which coincided with promoter hypomethylation (Extended Data 
Fig. 4a, b), was also observed in additional neuroblastoma cell lines 
rendered resistant to TAE684 (SK-N-SH) or the CDK12 inhibitor E9”! 
(SK-N-BE(2)) (Extended Data Fig. 4c, d), which suggests that our find- 
ings are not restricted to a single cell line or kinase inhibitor. Indeed, 
overexpression of BORIS in tumours was significantly associated with 
high-risk disease and a poor outcome in patients with neuroblastoma 
treated with a variety of regimens (Extended Data Fig. 4e-g). 

To clarify the role of BORIS in the resistance phenotype, we depleted 
its expression in resistant cells, and observed a partial reversal to the 
sensitive-cell state with re-emergence of MYCN and ALK expression 
(Fig. 1f, Extended Data Fig. 5a-c). However, this outcome was insuf- 
ficient to maintain cell growth, as depletion of BORIS in resistant cells 
ultimately decreased cell viability (Extended Data Fig. 5d, e), which 
indicates a switch from MYCN to BORIS dependency with stable resist- 
ance. This transition was associated with changes in cellular growth 
kinetics—from a highly proliferative, MYCN-overexpressing sensitive 
state to an intermediate, slow-cycling phenotype that was partially 
reversed in fully resistant cells, coincident with overexpression of 

Fig. 2 | BORIS overexpression is associated with its increased chromatin 
occupancy in resistant cells, whereas CTCF binding is unchanged. 
a, Scatter plot of BORIS binding in sensitive (Sens) and resistant (Res) 
cells for all detected BORIS-binding sites. BORIS peaks unique to resistant 
cells (n = 21,805; 91%), sensitive cells (n = 1,125; 4.7%) and shared 
between the two cell types (n = 1,086; 4.5%) are shown. b, Scatter plot 
of CTCF binding in sensitive and resistant cells for all detected CTCF- 
binding sites. CTCF peaks unique to resistant cells (n = 6,808; 8.3%), 
sensitive cells (n = 19,129; 23.2%) and shared between the two cell types 
(n = 56,438; 68.5%) are shown. c, Overlap between BORIS peaks that 
are unique to resistant cells and CTCF peaks shared between resistant 
and sensitive cells (top), and between resistant cell-specific BORIS peaks 
and sensitive cell-specific CTCF peaks (bottom). d, Meta-analysis of 
average ChIP-seq signals at resistant cell-specific BORIS-binding sites. 
All panels, n = 2 biological replicates. 


BORIS (Extended Data Fig. 5f-h). Given the many sequential steps 
involved in the evolution of resistance, overexpression of BORIS alone 
was not adequate to induce this phenotype (data not shown). Instead, 
concomitant downregulation of MYCN expression and BORIS overex- 
pression in the presence of ALK inhibition were required to generate 
resistance in sensitive cells (Fig. 1g). This combination of factors also 
led to increased expression of the transcription factors that were upreg- 
ulated in the original TAE684-resistant cells, including SOX2 and SOX9 
(Extended Data Figs. 3d, 5i). Thus, resistance to inhibition of ALK in 
neuroblastoma cells evolves through a multistep process that promotes 
a dependency switch from a dominant oncogenic stimulus—amplified 
MYCN—to a phenotypically distinct state characterized by overexpres- 
sion of BORIS. In this context, the resistant cells ultimately become 
dependent on BORIS for survival, which supports a key role for this 
protein in maintenance of the resistance state. 

We next asked whether the aberrant expression of BORIS, a DNA- 
binding protein®, affected its genome-wide occupancy in resistant 
cells. We observed a large (tenfold) gain in BORIS-bound peaks 
after chromatin immunoprecipitation followed by high-throughput 
sequencing (ChIP-seq) analysis in resistant cells: 22,891 versus 2,211 
in sensitive cells (Fig. 2a, Extended Data Fig. 6a, b). By contrast, CTCF 
binding did not change substantially between sensitive and resistant 
cells (75,567 versus 63,246 peaks) (Fig. 2b). A considerable portion 
(n = 17,042; 78%) of the BORIS peaks unique to resistant cells over- 
lapped with CTCF peaks shared by both cell types (Fig. 2c), consistent 
with their heterodimerization”” (Extended Data Fig. 6c). However, 
only a small proportion (n = 1,903; 8.7%) overlapped with CTCF 
peaks unique to sensitive cells, which suggests that BORIS does not 
replace CTCF in resistant cells. BORIS preferentially occupied gene 
regulatory regions—enhancers and promoters (60%)—in resistant cells 
(Extended Data Fig. 6d, e), which is consistent with its propensity to 
bind to open chromatin regions”? (Fig. 2d). Such differential chromatin 

Fig. 3 | BORIS promotes new chromatin interactions in resistant 
cells. a, DNA interactions gained in resistant cells based on SMC1A 
HiChIP analysis. Interaction classes were determined from the genomic 
locations of the associated anchors (overlapping promoter (Prom) regions 
(transcription start site (TSS) ± 2 kb), active enhancer (Enh) regions, or 
CTCF sites only, in that order). Absolute numbers and percentages for 
each loop type (structural (black), regulatory (blue)) are shown. Cartoon 
illustrates the spatial proximity induced by DNA looping between these 
regions. b, Fractions of loops bound by BORIS within each interaction 
class. c, Meta-analysis of average CTCF and BORIS ChIP-seq signals in 
sensitive and resistant cells at the three main anchor types normalized by 
the number of interactions (n = 2 biological replicates). Anchor sites were 
centred and extended in both directions (± 2 kb). d, ChIP-seq tracks of 
the indicated proteins in sensitive and resistant cells at the BORIS locus 


binding at distinct highly expressed genes in resistant versus sensitive 
cells was commensurate with the MYCN-to-BORIS dependency switch 
(Extended Data Fig. 6f, g). 
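
The peak counts quoted in the text and in the Fig. 2 legend are mutually consistent, which can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the original analysis; the dictionary keys and helper names are assumptions introduced here, whereas the numbers are taken from the text.

```python
# Peak counts as reported in Fig. 2a, b and the main text.
boris = {"resistant_only": 21_805, "sensitive_only": 1_125, "shared": 1_086}
ctcf = {"resistant_only": 6_808, "sensitive_only": 19_129, "shared": 56_438}


def totals(peaks):
    """Return (peaks in resistant cells, peaks in sensitive cells, union)."""
    resistant = peaks["resistant_only"] + peaks["shared"]
    sensitive = peaks["sensitive_only"] + peaks["shared"]
    union = sum(peaks.values())
    return resistant, sensitive, union


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


boris_res, boris_sens, boris_all = totals(boris)
ctcf_res, ctcf_sens, ctcf_all = totals(ctcf)

# Gain in BORIS-bound peaks in resistant versus sensitive cells (about tenfold).
boris_fold_gain = boris_res / boris_sens

# Overlap of resistant-only BORIS peaks with CTCF peaks (Fig. 2c).
overlap_shared_ctcf = percent(17_042, boris["resistant_only"])
overlap_sensitive_ctcf = percent(1_903, boris["resistant_only"])
```

The totals reproduce the 22,891 versus 2,211 BORIS peaks and the 75,567 versus 63,246 CTCF peaks given in the text.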

The proclivity of aberrantly expressed BORIS for genomic regions 
associated with active chromatin features in resistant cells suggested 
that it may, like CTCF and cohesin, regulate gene expression through 
chromatin looping. Thus, we examined the chromatin looping pro- 
files of sensitive and resistant cells, using cohesin (SMC1A)-based 
high-throughput chromosome conformation capture followed by 
chromatin immunoprecipitation (HiChIP)™ (Extended Data Fig. 7a). 
On the basis of the genomic locations of the associated loop anchors, 
six classes of interactions were identified»: three longer average inter- 
action loops with a CTCF site on at least one anchor; and three smaller 
connecting regulatory regions (Fig. 3a, Extended Data Fig. 7b). The 
overlap of BORIS binding with loop anchors revealed that most (56%) 
of the 9,487 interactions gained in resistant cells were positive for 
BORIS (log2-transformed fold change > 1; false discovery rate (FDR) 
< 0.01) (Fig. 3b, Extended Data Fig. 7c). Notably, BORIS was enriched 
at anchors that were associated with regulatory regions, whereas CTCF 
binding remained constant, as seen at the BORIS locus itself (Fig. 3c, d). 
In fact, BORIS binding alone at CTCF-negative loop anchors was 
sufficient to generate new interactions in resistant cells (Extended Data 
Fig. 7d). 
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(representative of two independent experiments), with resistant cell- 
specific regulatory interactions shown below (HiChIP resistant: paired- 
end tag (PET) numbers, next to each interaction). Signal intensity is given 
in the top left corner for each track. e, PET interactions in BORIS-depleted 
(shBORIS) versus control (shCtrl) cells. f, Resistant cell-specific loops lost 
after depletion of BORIS based on loops negative or positive for BORIS 
binding in shCtrl cells (left), and the odds ratio of losing a loop previously 
bound by BORIS (right). P value determined by two-sided Fisher’s exact 
test. g, Meta-analysis of average BORIS, SMC1A and CTCF ChIP-seq 
signals at resistant cell-specific loop anchors that were lost after depletion 
of BORIS (n = 2 biological replicates). BORIS depletion at loop anchors 
inhibits retention of the cohesin complex, and thus prevents the formation 
of new loops (loop extrusion model). In a, b, e and f, n = 3 biological 
replicates. 
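
The priority rule used to assign interaction classes in Fig. 3a (promoter first, then active enhancer, then CTCF only) can be sketched as a small classifier. This is a minimal sketch and not the authors' pipeline: the function names and the set-based input are hypothetical, and it assumes that a loop counts as structural when at least one anchor is labelled CTCF-only after the priority rule has been applied.

```python
# Priority order taken from the Fig. 3a legend: promoter, then enhancer, then CTCF only.
PRIORITY = ("Prom", "Enh", "CTCF")


def label_anchor(features):
    """Label one anchor from the set of features it overlaps (hypothetical input)."""
    for name in PRIORITY:
        if name in features:
            return name
    return None  # anchor overlaps none of the three feature types


def classify_loop(anchor_a, anchor_b):
    """Return (interaction class, loop type), or None if an anchor is unlabelled."""
    labels = [label_anchor(anchor_a), label_anchor(anchor_b)]
    if None in labels:
        return None
    labels.sort(key=PRIORITY.index)
    # Assumption: any CTCF-only anchor makes the loop structural.
    loop_type = "structural" if "CTCF" in labels else "regulatory"
    return "-".join(labels), loop_type


# All classes reachable from labelled anchors: three regulatory, three structural.
ALL_CLASSES = {
    classify_loop({a}, {b}) for a in PRIORITY for b in PRIORITY
}
```

Under these assumptions the classifier yields six classes, three regulatory and three involving CTCF, which matches the number of classes described in the text.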


To test whether the newly formed interactions in resistant cells were 
mediated by BORIS binding, we analysed the consequences of BORIS 
depletion on loop architecture (Extended Data Fig. 7e). Regulatory 
interactions specific to resistant cells displayed a global shift towards 
loss after knockdown of BORIS (Fig. 3e), with more than one-quarter 
of the total interactions lost, of which 63% were positive for BORIS at 
their anchors (Fig. 3f). Interactions in which anchors were bound by 
BORIS (especially enhancer-promoter and promoter-promoter inter- 
actions) were more likely to be lost after BORIS depletion than those 
that were not BORIS-bound (Fig. 3f, Extended Data Fig. 7f, g). These 
results agree with the loop extrusion model?®, as BORIS loss resulted in 
decreased SMC1A binding, preferentially at lost interactions, whereas 
CTCF binding did not change significantly (Fig. 3g, Extended Data 
Fig. 7h-j). These data confirm that BORIS is a crucial factor in the 
looping landscape of resistant cells. 
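
The odds ratio and two-sided Fisher's exact test reported for Fig. 3f compare loop loss between BORIS-bound and unbound loops in a 2 × 2 table. The sketch below shows how such a test can be computed from first principles. The counts in the example are hypothetical placeholders, because the underlying table is not given in the text, and the function names are introduced here for illustration only.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= observed * (1 + 1e-9))


# Hypothetical counts (not from the paper):
# rows = BORIS-bound / not bound, columns = loop lost / loop retained.
example_or = odds_ratio(12, 8, 5, 15)
example_p = fisher_exact_two_sided(12, 8, 5, 15)
```

An odds ratio above 1 in this layout would indicate that BORIS-bound loops are more likely to be lost after BORIS depletion than unbound loops.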

Genes associated with new BORIS-positive regulatory interactions 
were expressed at higher levels than those associated with BORIS- 
negative regulatory interactions or genes not associated with new 
regulatory interactions (Fig. 4a). Because genes that define cell iden- 
tity are often regulated by super-enhancers in both healthy and can- 
cer cells'>?78, we characterized the super-enhancer landscape of our 
cells, observing that the super-enhancers unique to resistant cells were 
enriched at BORIS-positive regulatory loops (Extended Data Fig. 8a-c). 

Fig. 4 | BORIS-regulated chromatin remodelling supports a 
phenotypic switch that maintains the resistant state. a, Left, fold 
change in expression in counts per million (CPM) of genes involved in 
resistant cell-specific regulatory interactions that are positive for BORIS 
binding (n = 1,368) versus those involved in regulatory interactions 
that are negative for BORIS binding (n = 519) or not associated with 
a new regulatory interaction (other) (n = 16,151). Centre, fold change 
in expression of genes involved in resistant cell-specific regulatory 
interactions positive for BORIS binding and associated with super- 
enhancers (SEs) specific to resistant cells (n = 134) versus those with 
super-enhancers shared by both cell types (n = 514) or not associated 
with super-enhancers (n = 720). Right, fold change in expression of genes 
involved in resistant cell-specific regulatory interactions positive for 
BORIS binding and associated with resistant cell-specific super-enhancers 
before and after BORIS knockdown (KD) (n = 134) (P values determined 


The presence of such super-enhancers correlated significantly with 
higher expression of their associated genes in resistant versus sensitive 
cells (Fig. 4a). These BORIS-positive super-enhancer-associated 
genes were also enriched for genes that underwent a chromatin state 
switch from a closed or neutral to an open configuration in resistant 
cells (Extended Data Fig. 8d, e). Depletion of BORIS resulted in the 
decreased expression of genes associated with BORIS-positive inter- 
actions, especially genes associated with resistant cell-specific super- 
enhancers (Fig. 4a, Extended Data Fig. 8f). These observations suggest 
that BORIS-mediated alterations in chromatin looping lead to interac- 
tions of newly formed super-enhancers with their target genes, which 
results in their increased expression. 

We next sought to identify BORIS-regulated genes that are function- 
ally linked to the resistance phenotype by integrating gene expression, 
BORIS-mediated looping, super-enhancer landscape and chromatin 
state. This analysis revealed 89 genes (Supplementary Table), including 

by two-sided Wilcoxon rank-sum test). For all box plots, centre lines 
denote medians; box limits denote twenty-fifth and seventy-fifth 
percentiles; whiskers denote minima and maxima (1.5 × the interquartile 
range). b, Highest-ranked transcription factors associated with the 
resistance phenotype selected based on the presence of at least four of 
the five indicated features. c, ChIP-seq tracks of the indicated proteins in 
sensitive and resistant cells at the NEUROG2 locus; regulatory interactions 
with PET numbers indicated below. d, Transcription factor recognition 
motifs at super-enhancers and promoters (± 2 kb) of the 1,000 highest- 
expressed genes in resistant and sensitive cells (n = 2 biological replicates) 
(P values determined by hypergeometric enrichment test). Panels a-c 
integrate data of biological replicates from expression microarrays (n = 2), 
ChIP-seq (n = 2) and HiChIP (n = 3). e, Proposed role of BORIS in 
resistant cells. 


13 transcription factors, that are highly expressed during early neural 
development and are crucial to cell fate decisions?°??>9 (Fig. 4b, c, 
Extended Data Fig. 8g). The expression of these proneural transcription 
factors paralleled that of BORIS in resistant cells, and was dependent 
on BORIS-mediated looping, as BORIS depletion led to their downreg- 
ulation (Extended Data Fig. 8h, i). Moreover, analysis of transcription 
factor binding sites revealed enrichment of BORIS and several of these 
proneural transcription factors at the regulatory regions of the high- 
est-expressed genes in resistant cells, whereas sensitive cells were domi- 
nated by MYC, MYCN and MAX E-box and E-box-like motifs (Fig. 4d). 
Similar increased expression of proneural transcription factors 
with increased BORIS occupancy at their promoters was seen in 
BORIS-overexpressing E9-resistant SK-N-BE(2) neuroblastoma cells 
compared with their sensitive counterparts (Extended Data Fig. 8j, k). 
The high transcriptional activity of these BORIS-regulated genes was 
also associated with increased binding of the transcriptional activator 
BRD4, which rendered the resistant cells more sensitive to BET inhi- 
bition (Extended Data Fig. 9; Supplementary Note 2). Together, these 
results indicate the establishment of an alternative transcription factor 
regulatory network controlled by BORIS-induced chromatin remod- 
elling to support the resistant cell state. 

Thus, using a pair of isogenic ALK-inhibitor sensitive and resistant 
neuroblastoma cell lines, we show that the CTCF paralogue BORIS 
can promote regulatory DNA interactions that support a phenotypic 
switch in the context of treatment resistance (Fig. 4e). This mechanism 
appears relevant to different neuroblastoma cell lines and kinase inhib- 
itors and may extend to other cancers. In Ewing sarcoma, in which 
overexpression of BORIS is associated with metastasis and relapse 
(Extended Data Fig. 1c), we observed increased BORIS occupancy at 
regulatory regions in chemotherapy-resistant cell lines (Extended Data 
Fig. 10; Supplementary Note 3). Further work will establish whether 
BORIS-mediated alteration of chromatin looping is a general mecha- 
nism by which tumour cells co-opt developmental networks to sustain 
alternative cell states in response to targeted or conventional therapies. 

METHODS 

Cell lines. Human neuroblastoma cell lines Kelly and SK-N-BE(2) and human 
Ewing sarcoma cell lines TC-32, TC-71 and CHLA-10°!? were obtained from 
the Children’s Oncology Group cell line bank. Human neuroblastoma cell line 
SK-N-SH and human embryonic kidney cell line HEK293T were obtained from 
the American Type Culture Collection. Cell line authenticity was confirmed by 
genotyping, and cells were tested negative for mycoplasma contamination every 
3 months. All cells except HEK293T were grown in RPMI-1640 medium sup- 
plemented with 10% fetal bovine serum (FBS) and 1% penicillin/streptomycin 
(Life Technologies). HEK293T cells were grown in DMEM medium supplemented 
with 10% FBS and 1% penicillin/streptomycin (Life Technologies). Resistant cells 
were grown in the presence of either the ALK inhibitor, TAE684" (Kelly and 
SK-N-SH) or the CDK12 inhibitor, E9*! (SK-N-BE(2)). 

Compounds. TAE684 and E9 were synthesized in-house in the Gray laboratory 
and JQ1°? was obtained from J. Qi's laboratory at the Dana-Farber Cancer Institute 
(DFCI). Ceritinib*4, lorlatinib*> and I-BET726*° were purchased from Selleck 
Chemicals. 

Synthetic RNA spike-in and microarray analysis. Total RNA and sample prepara- 
tion was performed as previously described”. In brief, cells were either incubated 
in medium containing DMSO, TAE684 (1 μM) or JQ1 (2.5 μM), or infected with 
shRNA (Ctrl or BORIS) for 24 h. Cell numbers were determined using a Countess 
II cell counter (Life Technologies) before lysis and RNA extraction. Biological 
duplicates (equivalent to 5 x 10° cells per replicate) were collected and homoge- 
nized in 1 ml of TRIzol Reagent (Ambion), purified using the mirVANA miRNA 
isolation kit (Ambion) following the manufacturer's instructions and re-suspended 
in 50 μl nuclease-free water (Ambion). Total RNA was spiked-in with ERCC RNA 
Spike-In Mix (Ambion), treated with DNA-free DNase I (Ambion) and analysed 
on an Agilent 2100 Bioanalyzer (Agilent Technologies) for integrity. RNA with 
the RNA Integrity Number above 9.8 was hybridized to Affymetrix GeneChip 
PrimeView Human Gene Expression arrays (Affymetrix). 

Antibodies. The following antibodies were used: N-MYC (9405), N-MYC 
(51705), cleaved PARP (9541), cleaved caspase 3 (9661), ALK (3333), AKT (4691), 
pAKT-T308 (9275), pAKT-S473 (9271), ERK (4695), pERK (4377), S6 (2217), 
pS6 (4857), STAT3 (4904), pSTAT3 (9131), ABCB1 (12683), SOX2 (3579), β-actin 
(4967), CTCF (3417), normal rabbit IgG (2729) and HRP anti-mouse IgG (7076) 
from Cell Signaling Technology; HRP anti-rabbit IgG (sc-2357) from Santa Cruz 
Biotechnology; BRD4 (A301-985A100) and SMC1A (A300-055A) from Bethyl 
Laboratories; CTCF (07-729), SOX9 (AB5535) and H3K27me3 (07-449) from 
Millipore; pALK-Y1507 (ab73996), BORIS (ab187163) and H3K27ac (ab2729) 
from Abcam; BORIS (NBP2-52405) from NOVUS Biologicals; BORIS (39851) 
from Active Motif; SIX1 (HPA001893) from Sigma-Aldrich; and Vysis LSI N-MYC 
(2p24) SpectrumGreen/Vysis CEP 2 SpectrumOrange Probe (07J72-001) from 
Abbott. 

Cell viability and growth curve assays. Viability and growth experiments were 
performed using the CellTiter-Glo Luminescent Cell Viability Assay (Promega) 
according to the manufacturer’s instructions, as previously described**. Cells were 
plated in 96-well plates at a seeding density of 4 x 10° cells per well. For growth 
assays, the cells were analysed each day until day 5. For viability, after 24 h, the cells 
were treated with various concentrations of the indicated drug (ranging from 1 nM
to 10 μM, except for I-BET726: 2 nM to 20 μM). DMSO without drug served as a
negative control. After 72 h of incubation, cells were analysed for cell viability and 
IC50 values were determined using a nonlinear regression curve fit with GraphPad
Prism 6 software. 

Cell-cycle analysis. Cell-cycle analysis was performed 24 h after cell plating using 
propidium iodide staining, as previously described!>. Cells fixed with 80% ethanol 
overnight at 4°C were resuspended in PBS supplemented with 0.1% Triton X-100 
(Sigma-Aldrich), 25 mg ml⁻¹ propidium iodide (BD Biosciences) and 0.2 mg ml⁻¹
RNase A (Sigma-Aldrich). After 45 min at 37°C in the dark, analysis was per- 
formed on a FACSCalibur flow cytometer (BD Biosciences). Cell-cycle profiles 
were plotted as histograms generated using FlowJo software (FLOWJO). 

Western blotting. Cell or tumour tissue was lysed in NP-40 buffer (Invitrogen)
containing a 1× complete protease inhibitor tablet (Roche) per 10 ml buffer and a
cocktail of phosphatase inhibitors (Roche). Protein concentration was measured
using the DC Protein Assay (Bio-Rad); protein (50 μg) was denatured in LDS sample
buffer with reducing agent (Invitrogen), separated on precast 4-12% Bis-Tris
gels (Life Technologies) and transferred to nitrocellulose membranes (Bio-Rad). 
Membranes were incubated in blocking buffer (5% dry milk in TBS with 0.2% 
Tween-20) for 1 h, and then incubated in the primary antibody in blocking buffer 
overnight at 4°C. Chemiluminescent detection was performed with the appropriate 
secondary antibodies and developed using Genemate Blue ultra-autoradiography 
film (VWR). The actin loading controls for the protein samples shown in the 
immunoblots of the following panels (two independent mouse tumour samples, 
and cell lines representative of two independent experiments) are the same because 
the samples were run on a single gel but probed for pALK, ALK (Extended Data
Fig. 2a), MYCN (Extended Data Fig. 3e) and BORIS (Extended Data Fig. 4a),
respectively. 

Co-immunoprecipitation. Cells were collected in immunoprecipitation lysis 
buffer (50 mM Tris-HCl buffer (pH 7.4), 100 mM NaCl, 1% Triton X-100, 1 mM
PMSF), containing a 1x complete protease inhibitor tablet (Roche) per 10 ml 
buffer and a cocktail of phosphatase inhibitors (Roche). Homogenates were centri- 
fuged at 20,000g for 10 min at 4°C to obtain supernatants. DNase I (approximately 
1 U ml⁻¹) was used to degrade DNA in supernatants by incubation for 1 h at room
temperature. Co-immunoprecipitation of endogenously expressed proteins was 
performed using protein A Dynabeads (Invitrogen), according to the manufac- 
turer’s instructions. In brief, antibody-conjugated Dynabeads were incubated with 
purified cell lysates to immunoprecipitate the target antigen. Antibodies used for 
immunoprecipitation were CTCF (3417, Cell Signaling Technology) and BORIS 
(NBP2-52405, NOVUS Biologicals). The elution step was conducted by heating 
the beads for 10 min at 95°C in lithium dodecyl sulfate (LDS) sample buffer with 
reducing agent (Invitrogen), after which western blotting was performed using 
the following antibodies: CTCF (3417, Cell Signaling Technology) and BORIS 
(39851, Active Motif).

Plasmids, shRNA knockdown and overexpression systems. pLKO.1 
shRNA constructs (control: SHC007; MYCN: 1-TRCN0000020694 and 
2-TRCN0000363425; BORIS: 3-TRCN0000370229 and 4-TRCN0000365141; 
BRD4: A-TRCN0000318771 and B-TRCN0000196576) were purchased from 
Sigma-Aldrich and pLKO.1 GFP shRNA was a gift from D. Sabatini (Addgene plas- 
mid 30323)*”. Overexpression constructs were generated by cloning BORIS cDNA 
into the Tet-inducible pInducer20 vector, provided by S. Elledge (Addgene plasmid 
44012)*°. Production of lentiviral particles and subsequent infection were per- 
formed as previously described**. The lentivirus was packaged by co-transfection of 
either pLKO.1 or pInducer20 plasmid with the helper plasmids, pCMV-deltaR8.91 
and pMD2.G-VSV-G into HEK293T cells using TransIT-LT1 Transfection Reagent 
(Mirus Bio LLC). Virus-containing supernatants were collected 48 h after transfection.
Cells were infected with 8 μg ml⁻¹ polybrene (Sigma-Aldrich) and 24-48
h later selected with puromycin (pLKO.1) (Sigma-Aldrich) and then collected at 
appropriate time points. When using the Tet-inducible system for BORIS overex- 
pression, induction of gene expression was achieved by treating cells every 2-3 days 
with doxycycline (0.2 μg ml⁻¹) for a total duration of 37 days.

qRT-PCR. RNA isolation and PCR amplification were performed as previously 
described*®, except that the RT-PCR was performed using the SuperScript III 
First-Strand system (Life Technologies). Total RNA was isolated from cell lines with 
the RNeasy kit (Qiagen). One microgram of purified RNA was reverse transcribed 
using SuperScript III First-Strand (Invitrogen) according to the manufacturer's
protocol, and quantitative PCR was performed using SYBR Green on a Viia7 Real- 
Time PCR system (Thermo Fisher Scientific). All experiments were performed in 
biological triplicates unless stated otherwise. Each individual biological sample was 
amplified by qPCR in technical replicates and normalized to actin as an internal 
control. Amplification was carried out with primers specific to the genes to be 
quantified (sequences available on request). 

Sequence analysis. The kinase domain of ALK was amplified from cDNA 
extracted from sensitive and resistant cells using the HotStar HiFidelity Polymerase 
Kit (Qiagen). The PCR products were cloned into the pGEM-T vector (Promega) 
and confirmed by sequencing. 

RTK array. The Human Phospho-RTK Array Kit (R&D Systems) was used as 
previously described**. Cell lysate (500 μg) was incubated on a phospho-RTK
membrane array (ARY001B) according to the manufacturer’s instructions. Target 
proteins were captured with their respective antibodies. After washing, the pro- 
teins were incubated with a phosphotyrosine antibody conjugated to horseradish 
peroxidase to allow the detection of captured phosphorylated RTKs. 

Fluorescent in situ hybridization. Fluorescent in situ hybridization (FISH) analyses
were performed using a Vysis LSI N-MYC (2p24) SpectrumGreen/Vysis CEP
2 SpectrumOrange Probe (Vysis), in accordance with the manufacturer’s instruc- 
tions. 

Immunohistochemistry. All human tumour specimens (formalin-fixed par- 
affin-embedded slides) were obtained under an Institutional Review Board 
(IRB)-approved protocol of the Dana-Farber/Boston Children's Cancer and 
Blood Disorders Center, and informed consent was obtained from all subjects. 
Staining was performed by Applied Pathology Systems using the ImmPRESS Excel 
Amplified HRP Polymer Staining Kit (MP-7601, Vector Laboratories) on a Dako 
Autostainer (Agilent Technologies). Sections were deparaffinized, rehydrated, and 
subjected to antigen retrieval in citrate-based buffer on a steamer for 25 min. Slides 
were blocked with BLOXALL blocking solution and 2.5% horse serum sequen- 
tially before a 1-h incubation with BORIS antibody at 1:50 dilution (ab187163, 
Abcam). Sections were then incubated with anti-rabbit amplifier antibody and 
ImmPRESS Excel Amplified HRP Polymer Reagent sequentially before incuba- 
tion with ImmPACT DAB EqV Substrate. Finally, slides were counterstained with 
haematoxylin, followed by dehydration and the addition of coverslips. 

Bisulfite sequencing. Methylation analysis of BORIS (NCBI RefSeq
NC_000020.11, spanning nucleotides chr20: 57,524,203-57,525,234 on GRCh38.p7 
assembly) was performed using a bisulfite sequencing assay. Genomic DNA (500 ng) 
was treated with the EZ DNA Methylation-Lightning Kit (Zymo Research), 
followed by PCR using ZymoTaq Polymerase premix (Zymo Research) and specific 
primers designed using the Zymo bisulfite primer seeker
(http://www.zymoresearch.com/tools/bisulfite-primer-seeker/; sequences available on request). PCR
products were then sequenced for the assessment of CpG site-specific DNA meth- 
ylation in the BORIS promoter region. 

Growth assay. After shRNA-mediated knockdown of BORIS, cells were reseeded 
at a density of 4 x 10° cells per well in 6-well plates. At 48 and 120 h of incubation, 
cells were stained with trypan blue (Sigma-Aldrich) and counted on a Countess II 
cell counter (Life Technologies). 

Mouse experiments. All mouse experiments were performed with approval 
from the Institutional Animal Care and Use Committee (IACUC) of the 
DFCI. Three mouse experiments were performed: (i) to assess the tumor- 
igenic potential of resistant cells in vivo; (ii) to assess that resistance to 
TAE684 was maintained in vivo; and (iii) to assess the effect of JQ1 on resist- 
ant cells in vivo. All experiments were performed using subcutaneous cell 
xenograft models generated by injecting 2 x 10° sensitive or resistant Kelly 
neuroblastoma cells into the flanks of NU/NU (Crl:NU-Foxn1nu) (Charles
River Laboratories) or NU/NU (CrTac:NCr-Foxn1nu) (Taconic) 7-week-old
female mice. Mice were randomized into groups of equal average volumes, 
and investigators were not blinded to group allocation during data collec- 
tion. (i) To assess the tumorigenic potential of resistant cells in absence of 
treatment, mice with established disease (mean tumour volume of 200 mm³)
were monitored for up to 23 days (n = 4 per group). Tumours were obtained,
dissociated and used to establish cell lines and for assessment of mRNA 
levels, protein expression and sensitivity to TAE684. (ii) To ensure that 
the in vitro resistance to TAE684 was maintained in vivo, mice with estab- 
lished disease were divided into two cohorts and were treated with either 
TAE684 (10 mg kg⁻¹) or vehicle control by oral gavage once daily (n = 8
per group), and were monitored for up to 56 days from start of treatment. 
(iii) To assess the sensitivity of resistant cells to BRD4 inhibition, mice with 
established disease were divided into two cohorts and treated with either 
JQ1 (50 mg kg⁻¹) or vehicle control intraperitoneally (i.p.) once daily (n = 6
per group), and were monitored for up to 87 days from start of treatment. 
For all experiments, disease burden was quantified using electronic caliper 
measurements (2-3 times a week) and mouse weights were moni- 
tored at least twice a week. Tumour volumes were calculated using 
the modified ellipsoid formula*!: ½ × (length × width²). Animals were
euthanized when tumour volumes reached 1,500-2,000 mm³ based
on institutional IACUC criteria for maximum tumour volumes. In
none of the experiments were the institutional limits for tumour
volumes (<2,000 mm³ measurement preceding the day of euthanization)
exceeded. 
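
As an illustration of the volume calculation, the following minimal sketch assumes the modified ellipsoid formula is read as ½ × length × width², with caliper measurements in millimetres. The function names and the 2,000 mm³ default are illustrative choices, not part of the original protocol.

```python
def tumour_volume_mm3(length_mm: float, width_mm: float) -> float:
    """Modified ellipsoid estimate: V = 1/2 * length * width^2 (mm^3)."""
    if length_mm < 0 or width_mm < 0:
        raise ValueError("caliper measurements must be non-negative")
    return 0.5 * length_mm * width_mm ** 2


def reaches_limit(volume_mm3: float, limit_mm3: float = 2000.0) -> bool:
    """True when a volume reaches the (assumed) institutional maximum."""
    return volume_mm3 >= limit_mm3


if __name__ == "__main__":
    v = tumour_volume_mm3(10.0, 8.0)  # hypothetical measurement
    print(v, reaches_limit(v))
```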

ChIP-seq. ChIP was carried out as previously described!° with minor changes, as
described below. Approximately 1 × 10⁷ cells were crosslinked for 10 min at room temperature
with 1% formaldehyde (Thermo Scientific) in PBS followed by quenching
with 0.125 M glycine for 5 min. The cells were then washed twice in ice-cold PBS, 
and the cell pellets were flash frozen and stored at −80°C. Fifty microlitres of protein
G Dynabeads per sample (Invitrogen) were blocked with 0.02% Tween20 (w/v) 
in PBS. Magnetic beads were loaded with 10 μg each of antibody and incubated
overnight at 4°C. Crosslinked cells were lysed, placed in sonication buffer with 
0.2% SDS, placed on ice and chromatin was sheared using a Misonix 3000 sonicator 
(Misonix) at the following settings: 10 cycles, each for 30 s on, followed by 1 min 
off, at a power of approximately 20 W. The lysates were then centrifuged for 10 
min at 4°C, supernatants collected and diluted with an equal amount of sonica- 
tion buffer to reach a final concentration of 0.1% SDS. The sonicated lysates were 
incubated overnight at 4°C with the antibody-bound magnetic beads, washed 
with low-salt buffer (50 mM HEPES-KOH (pH 7.5), 0.1% SDS, 1% Triton X-100, 
0.1% sodium deoxycholate, 1 mM EGTA, 1 mM EDTA, 140 mM NaCl and 1x 
complete protease inhibitor), high-salt buffer (50 mM HEPES-KOH (pH 7.5), 0.1% 
SDS, 1% Triton X-100, 0.1% sodium deoxycholate, 1 mM EGTA, 1 mM EDTA, 
500 mM NaCl and 1× complete protease inhibitor), LiCl buffer (20 mM Tris-HCl
(pH 8), 0.5% NP-40, 0.5% sodium deoxycholate, 1 mM EDTA, 250 mM LiCl and 
1x complete protease inhibitor) and Tris-EDTA buffer. DNA was then eluted in 
elution buffer (50 mM Tris-HCl (pH 8.0), 10 mM EDTA, 1% SDS), and high- 
speed centrifugation was performed to pellet the magnetic beads and collect the 
supernatants. The crosslinking was reversed overnight at 65°C. RNA and protein 
were digested using RNase A and proteinase K, respectively, and DNA was purified 
with phenol chloroform extraction and ethanol precipitation. Purified ChIP DNA 
was used to prepare Illumina multiplexed sequencing libraries using the NEBNext
Ultra II DNA Library Prep kit and the NEBNext Multiplex Oligos for Illumina
(New England Biolabs) according to the manufacturer's protocol. Libraries with
distinct indexes were multiplexed and run together on the Illumina NextSeq 500 
(SY-415-1001, Illumina) for 75 bases in single-read mode. 

HiChIP. HiChIP was performed as previously described™* with a few modifications.
Approximately 1 × 10⁷ cells were crosslinked for 10 min at room temperature
with 1% formaldehyde in growth medium and quenched in 0.125 M glycine.
After washing twice with ice-cold PBS, the supernatant was aspirated and the cell 
pellet flash frozen in liquid nitrogen. Crosslinked cell pellets were thawed on ice, 
resuspended in 1 ml of ice-cold Hi-C lysis buffer (10 mM Tris-HCl (pH 8.0), 10 
mM NaCl, 0.2% NP-40 and 1x complete protease inhibitor) and incubated at 
4°C for 30 min with rotation. Nuclei were pelleted by centrifugation for 5 min 
at 4°C and washed once with 500 μl of ice-cold Hi-C lysis buffer. After removing
the supernatant, nuclei were resuspended in 100 μl of 0.5% SDS and incubated at
62°C for 10 min. SDS was quenched by adding 335 μl of 1.5% Triton X-100 and
incubating for 15 min at 37°C. After the addition of 50 μl of 10× NEB Buffer
2 (New England Biolabs, B7002) and 375 U of MboI restriction enzyme (New
England Biolabs, R0147), chromatin was digested at 37°C for 2 h with rotation.
After digestion, MboI enzyme was heat-inactivated by incubating the nuclei at
62°C for 20 min. To fill in the restriction fragment overhangs and mark the DNA
ends with biotin, 52 μl of fill-in master mix, containing 37.5 μl of 0.4 mM biotin-dATP
(Invitrogen, 19524016), 1.5 μl of 10 mM dCTP (Invitrogen, 18253013),
1.5 μl of 10 mM dGTP (Invitrogen, 18254011), 1.5 μl of 10 mM dTTP (Invitrogen,
18255018), and 10 μl of 5 U μl⁻¹ DNA Polymerase I, Large (Klenow) Fragment
(New England Biolabs, M0210), were added and the tubes were incubated at 37°C
for 1 h with rotation. Proximity ligation was performed by the addition of 948 μl
of ligation master mix, containing 150 μl of 10× NEB T4 DNA ligase buffer (New
England Biolabs, B0202), 125 μl of 10% Triton X-100, 7.5 μl of 20 mg ml⁻¹ BSA
(New England Biolabs, B9000), 10 μl of 400 U μl⁻¹ T4 DNA ligase (New England
Biolabs, M0202), and 655.5 μl of water, and incubation at room temperature for
4 h with rotation. After proximity ligation, nuclei were pelleted by centrifugation
for 5 min and resuspended in 1 ml of ChIP sonication buffer (50 mM HEPES- 
KOH (pH 7.5), 140 mM NaCl, 1 mM EDTA pH 8.0, 1 mM EGTA (pH 8.0), 1% 
Triton X-100, 0.1% sodium deoxycholate, 0.1% SDS and 1x complete protease 
inhibitor). Nuclei were sonicated using a Misonix 3000 sonicator (Misonix) at the 
following settings: 12 cycles, each for 30 s on, followed by 1 min off, at a power 
of approximately 20 W. Sonicated chromatin was clarified by centrifugation for 
15 min at 4°C and the supernatant was transferred to a tube. Sixty microlitres of 
protein G Dynabeads (Invitrogen) were washed three times and resuspended in 
50 μl sonication buffer. Washed beads were then added to the sonicated chromatin
and incubated for 1 h at 4°C with rotation. Beads were then separated on a
magnetic stand and the supernatant was transferred to a new tube. Seventy-five
microlitres of protein G Dynabeads pre-incubated overnight at 4°C with 10 μg of
anti-SMC1A antibody (Bethyl A300-055A) or 10 μg of BORIS antibody (Abcam,
ab187163) were added to the tube and incubated overnight at 4°C with rotation. 
Beads were then separated on a magnetic stand and washed twice with 1 ml of 
sonication buffer, followed by once with 1 ml high-salt sonication buffer (50 mM 
HEPES-KOH (pH 7.5), 500 mM NaCl, 1 mM EDTA pH 8.0, 1 mM EGTA (pH 
8.0), 1% Triton X-100, 0.1% sodium deoxycholate, 0.1% SDS), once with 1 ml of 
LiCl wash buffer (20 mM Tris-HCl (pH 8.0), 1 mM EDTA pH 8.0, 250 mM LiCl, 
0.5% NP-40, 0.5% sodium deoxycholate, 0.1% SDS) and once with 1 ml of TE 
buffer with salt (10 mM Tris-HCl (pH 8.0), 1 mM EDTA pH 8.0, 50 mM NaCl). 
Beads were then resuspended in 200 μl of elution buffer (50 mM Tris-HCl (pH
8.0), 10 mM EDTA pH 8.0, 1% SDS) and incubated at 65°C for 15 min. To purify
the eluted DNA, RNA was degraded by the addition of 8.5 μl of 10 mg ml⁻¹ RNase
A and incubation at 37°C for 2 h. Protein was degraded by the addition of 20 μl
of 10 mg ml⁻¹ proteinase K and incubation at 55°C for 45 min. Samples were
then incubated at 65°C overnight to reverse crosslink protein-DNA complexes.
DNA was then purified using Zymo ChIP DNA Clean and Concentrator columns
(Zymo, D5205) according to the manufacturer's protocol and eluted in 14 μl water.
The amount of eluted DNA was quantified by Qubit dsDNA HS kit (Invitrogen,
Q32854). Tagmentation of ChIP DNA was performed using the Illumina Nextera
DNA Library Prep Kit (Illumina, FC-121-1030). First, 5 μl of MyOne Streptavidin
C1 Dynabeads (Invitrogen, 65001) was washed with 1 ml of Tween wash buffer
(5 mM Tris-HCl (pH 7.5), 0.5 mM EDTA (pH 8.0), 1 M NaCl, 0.05% Tween-20)
and resuspended in 10 μl of 2× biotin binding buffer (10 mM Tris-HCl (pH 7.5),
1 mM EDTA pH 8.0, 2 M NaCl). Then, 25 ng of purified DNA was added in a total
volume of 10 μl water to the beads and incubated at room temperature for 15 min
with agitation every 5 min. After capture, beads were separated with a magnet and
the supernatant was discarded. Beads were then washed twice with 500 μl of Tween
wash buffer, incubating at 55°C for 2 min with shaking for each wash. Beads were
resuspended in 25 μl of Nextera Tagment DNA buffer. To tagment the captured
DNA, 1 μl of Nextera Tagment DNA Enzyme 1 was added with 24 μl of Nextera
Resuspension Buffer and samples were incubated at 55°C for 10 min with shaking.
Beads were separated on a magnet and supernatant was discarded. Beads were
washed twice with 500 μl of 50 mM EDTA at 50°C for 30 min, washed twice with
500 μl of Tween wash buffer at 55°C for 2 min each, and finally washed once with
500 μl of 10 mM Tris-HCl (pH 7.5) for 1 min at room temperature. Beads were
separated on a magnet and supernatant was discarded. To generate the sequencing
library, PCR amplification of the tagmented DNA was performed while the DNA
was still bound to the beads. Beads were resuspended in 15 μl of Nextera PCR
Master Mix, 5 μl of Nextera PCR Primer Cocktail, 5 μl of Nextera Index Primer
1, 5 μl of Nextera Index Primer 2 and 20 μl water. DNA was amplified with 9-10
cycles of PCR. After PCR, beads were separated on a magnet and the supernatant 
containing the PCR-amplified library was transferred to a new tube, purified using 
Zymo DNA Clean and Concentrator columns (Zymo, D5205) according to the 
manufacturer's protocol, and eluted in 14 μl water. Purified HiChIP libraries were
size-selected to 300-700 bp using a Sage Science Pippin Prep instrument according
to the manufacturer’s protocol and subjected to 2 × 100 paired-end sequencing
using an Illumina HiSeq 2500 system (SY-401-2501, Illumina). 
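
The component volumes quoted for the HiChIP master mixes can be checked with simple arithmetic. The sketch below assumes all volumes are in microlitres and that the listed components are the complete recipe for each mix. The dictionary names are illustrative only.

```python
# Component volumes (microlitres) as listed in the HiChIP protocol text.
FILL_IN_MIX_UL = {
    "biotin-dATP (0.4 mM)": 37.5,
    "dCTP (10 mM)": 1.5,
    "dGTP (10 mM)": 1.5,
    "dTTP (10 mM)": 1.5,
    "Klenow fragment": 10.0,
}
LIGATION_MIX_UL = {
    "10x T4 DNA ligase buffer": 150.0,
    "10% Triton X-100": 125.0,
    "BSA (20 mg/ml)": 7.5,
    "T4 DNA ligase": 10.0,
    "water": 655.5,
}
LIBRARY_PCR_MIX_UL = {
    "Nextera PCR Master Mix": 15.0,
    "PCR Primer Cocktail": 5.0,
    "Index Primer 1": 5.0,
    "Index Primer 2": 5.0,
    "water": 20.0,
}


def total_volume_ul(mix: dict) -> float:
    """Sum the component volumes of a master mix."""
    return sum(mix.values())


if __name__ == "__main__":
    for name, mix in [("fill-in", FILL_IN_MIX_UL),
                      ("ligation", LIGATION_MIX_UL),
                      ("library PCR", LIBRARY_PCR_MIX_UL)]:
        print(name, total_volume_ul(mix))
```

The fill-in and ligation totals agree with the 52 μl and 948 μl stated in the text.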

scRNA-seq. Kelly cells (sensitive, intermediate and resistant states) were grown to 
70% confluence in T75 culture flasks. In brief, growth medium was aspirated and 
cells were treated with 0.25% Trypsin/EDTA for 3 min at 37°C, after which cells 
were washed twice with 1× PBS. Cells were then resuspended into single cells at a
concentration of 1 x 10° per ml in 1x PBS with 0.4% BSA for 10x Genomics pro- 
cessing. The sorted cell suspensions were loaded onto a 10x Genomics Chromium 
instrument to generate single-cell gel beads in emulsion (GEMs). Approximately 
5,000 cells were loaded per channel. scRNA-seq libraries were prepared using the 
following Single Cell 3’ Reagent Kits: Chromium Single Cell 3’ Library & Gel Bead 
Kit v2 (PN-120237), Single Cell 3’ Chip Kit v2 (PN-120236) and i7 Multiplex Kit 
(PN-120262) (10x Genomics) as previously described”, and following the Single 
Cell 3’ Reagent Kits v2 User Guide (Manual Part CG00052 Rev A). Libraries were 
run on an Illumina HiSeq 4000 system (SY-401-4001, Illumina) as 2 × 150 paired-end
reads, one full lane per sample, for approximately >90% sequencing saturation.

Genomics analysis: direct comparison of CTCF and BORIS expression in
healthy and tumour samples. To assess the expression levels and range of BORIS
and CTCF in healthy and tumour cells, all GTEx, TCGA and TARGET datasets
were downloaded, converted to FPKM values and displayed as [log2(FPKM +
1)] (Extended Data Fig. 1a, b).
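
A minimal sketch of the display transform, assuming the logarithm is base 2 and the pseudocount is 1. The function name is illustrative.

```python
import math


def log2_fpkm(fpkm: float) -> float:
    """Return log2(FPKM + 1); the pseudocount keeps zero expression at 0."""
    if fpkm < 0:
        raise ValueError("FPKM values cannot be negative")
    return math.log2(fpkm + 1.0)


if __name__ == "__main__":
    for value in (0.0, 1.0, 3.0):  # hypothetical FPKM values
        print(value, log2_fpkm(value))
```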

Association of BORIS with prognostic features. For each dataset, processed val- 
ues were extracted from the Gene Expression Omnibus (GEO) and scaled values 
were created by normalizing the expression levels by the minimum mean value of 
the conditions that were compared, $E^{s}_{i,j} = E_{i,j} / \min(\mathrm{average}(E_{j}))$. The two-sided
Wilcoxon rank-sum test on the original values was used to determine statistical 
differences between the compared conditions (Extended Data Fig. 1c and Extended 
Data Fig. 4f). 
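
The scaling step can be sketched as follows, assuming that the reference is the smallest of the per-condition means and that every value is divided by it. The function name and the example values are hypothetical.

```python
from statistics import mean


def scale_by_min_condition_mean(values_by_condition: dict) -> dict:
    """Divide all expression values by the lowest per-condition mean."""
    if not values_by_condition:
        raise ValueError("at least one condition is required")
    if any(len(v) == 0 for v in values_by_condition.values()):
        raise ValueError("every condition needs at least one value")
    reference = min(mean(v) for v in values_by_condition.values())
    if reference == 0:
        raise ValueError("minimum condition mean is zero; cannot scale")
    return {
        condition: [x / reference for x in values]
        for condition, values in values_by_condition.items()
    }


if __name__ == "__main__":
    example = {"low_risk": [2.0, 4.0], "high_risk": [6.0, 12.0]}
    print(scale_by_min_condition_mean(example))
```

After scaling, the condition with the lowest mean has a mean of 1, so the other conditions read directly as fold differences relative to it.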

Microarray data analysis. Microarray data were analysed using a custom CDF 
file (GPL16043) that contained the mapping information of the ERCC probes 
used in the spike-in RNAs. The arrays were normalized as previously described*”. 
In brief, all microarray chip data were imported in R (https://www.r-project.org/, 
v.3.1.3) using the affy package’? (v.1.44.0), converted into expression values using 
the expresso command, normalized to take into account the different numbers of 
cells and spike-ins used in the different experiments and renormalized using loess 
regression fitted to the spike-in probes. Sets of differentially expressed genes were 
obtained using the limma package“ (v.3.22.7) and a FDR value of 0.05. Spike-in 
normalized absolute expression values (counts) were normalized to CPM as a 
measurement of relative gene expression concentrations per condition. Total num- 
ber of transcripts per sample was determined as the total number of counts after 
spike-in normalization and the BORIS shRNA sample was first normalized to the 
control shRNA sample to account for technical effects that originated from the 
transfection protocol. 

ChIP-seq analysis. For all ChIP-seq samples, high-quality data were confirmed 
using the Fastqc tool (v.0.11.5) and samples were aligned to the human genome 
(build hg19, GRCh37.75) with STAR (v.2.5.1b_modified) and the parameters
‘--alignIntronMax 1 --alignEndsType EndToEnd --outFilterMultimapNmax 1
--outFilterMismatchMax 5’. Next, non-duplicate reads that mapped to the reference
chromosomes were retained using Samtools (v.1.3.1) and MarkDuplicates (v.2.1.1) 
from Picard tools. For each experimental replicate, antibody enrichment was 
assessed using the plotFingerprint command from deepTools (v.2.2.4). Peaks 
were identified with MACS2 (v.2.1.1) for narrow peaks (BORIS, CTCF, BRD4,
Pol2, MYCN) with the parameters ‘-q 0.01 --call-summits’ and for broad peaks
(H3K27ac, H3K27me3) with the parameters ‘--broad-cutoff 0.01’. Peaks overlapping
known artefact regions
(http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/) were removed. Input-normalized bedgraph
tracks were created with the deepTools command bamCompare and the parameters
‘--scaleFactorsMethod=readCount --ratio=subtract --binSize=50
--numberOfProcessors=4 --extendReads=200’. Subsequently, negative values were set to
zero and counts were scaled to RPM per bp to account for differences in library 
size. Bigwig files were created with bedGraphToBigWig (v.4). ChIP-seq replicates 
(n = 2) were merged at the BAM level after assessment of strong correlation with
the deepTools command ‘multiBigwigSummary BED-file’ using all replicate 
bigwigs and identified peaks as input. Identification of peaks and generation of 
tracks were then repeated for these merged files and used for further analyses. 
Downstream analyses for ChIP-seq and other genomic interval data were performed
in R (https://www.r-project.org/) (v.3.5.1) using the data.table (v.1.12.2)
package. 
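
The clipping and library-size scaling of the input-subtracted tracks can be sketched as below. This assumes the track values are already per-bp coverage after input subtraction and that RPM scaling multiplies by 10⁶ divided by the number of mapped reads. The function name and example numbers are hypothetical, and this is not the authors' code.

```python
def normalise_track(input_subtracted: list, mapped_reads: int) -> list:
    """Set negative bins to zero, then scale to reads per million (per bp)."""
    if mapped_reads <= 0:
        raise ValueError("mapped_reads must be positive")
    scale = 1_000_000 / mapped_reads
    return [max(value, 0.0) * scale for value in input_subtracted]


if __name__ == "__main__":
    # Hypothetical per-bp coverage values for three 50-bp bins.
    print(normalise_track([-2.0, 0.0, 5.0], mapped_reads=2_000_000))
```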

Gencode annotation and isoform selection. Gencode (http://www.gencodegenes.org/,
release 19) annotation was used and for each gene the most likely isoform
was selected based on data-driven criteria. In brief, only genes that were part of 
the Refseq transcriptome annotation and with a minimum length of 1 kb were 
considered. Next, isoforms were prioritized according to increased deposition of 
Pol2 and H3K27ac reads on the TSS, transcript length and alphabet rank, in that 
order, until only one transcript was selected for each gene. 
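
The prioritization order can be expressed as a sort key. The sketch below assumes that more Pol2 and H3K27ac reads at the TSS rank first, that longer transcripts are preferred on ties (the text does not state the direction for length), and that the alphabetically first identifier breaks any remaining tie. The `Isoform` type and its fields are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Isoform(NamedTuple):
    transcript_id: str   # hypothetical identifier
    tss_reads: float     # combined Pol2 + H3K27ac reads at the TSS
    length_bp: int


def select_isoform(isoforms: Sequence[Isoform]) -> Optional[Isoform]:
    """Pick one isoform per gene by TSS reads, then length, then name."""
    if not isoforms:
        return None
    return min(
        isoforms,
        key=lambda iso: (-iso.tss_reads, -iso.length_bp, iso.transcript_id),
    )


if __name__ == "__main__":
    candidates = [
        Isoform("T2", 10.0, 2000),
        Isoform("T1", 10.0, 2000),
        Isoform("T0", 5.0, 9000),
    ]
    print(select_isoform(candidates))
```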

Cell-type-specific binding patterns. To determine the cell specificities of BORIS 
and CTCF peaks, we first combined all peaks identified by MACS2 and merged 
the peak regions that overlapped by at least 50%. A 50% threshold was empiri- 
cally selected to avoid merging peaks that had clear and distinct summits. Next, 
normalized BORIS or CTCF read densities were calculated for each region and a
ratio [log2(resistant/sensitive)] was calculated. Peak regions with a twofold density
increase or decrease were classified as resistant or sensitive cell-specific peaks, 
respectively, whereas other regions were denoted as ‘shared’ to indicate that these 
peaks had similar BORIS or CTCF deposition in both cell types (Fig. 2a, b and 
Extended Data Fig. 6a). To explore the proximity of BORIS and CTCF peaks and 
how they were altered during the transition from sensitive to resistant cells, we 
overlapped all shared and cell-type-specific peaks from both cell types in the least 
stringent way (minimum 1-bp overlap) (Fig. 2c and Extended Data Fig. 6a). 
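
The twofold rule for assigning peaks to a cell type can be sketched as follows, assuming a base-2 log ratio of resistant over sensitive density and an inclusive threshold. The handling of zero densities is not described in the text, so the sketch simply rejects them. Function and label names are illustrative.

```python
import math


def classify_peak(resistant_density: float, sensitive_density: float,
                  fold: float = 2.0) -> str:
    """Label a merged peak region by its resistant/sensitive density ratio."""
    if resistant_density <= 0 or sensitive_density <= 0:
        raise ValueError("densities must be positive (add a pseudocount upstream)")
    log_ratio = math.log2(resistant_density / sensitive_density)
    threshold = math.log2(fold)
    if log_ratio >= threshold:
        return "resistant-specific"
    if log_ratio <= -threshold:
        return "sensitive-specific"
    return "shared"


if __name__ == "__main__":
    print(classify_peak(4.0, 1.0), classify_peak(1.0, 4.0), classify_peak(1.5, 1.0))
```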

Genomic enrichment of peak-binding sites. To identify genomic locations with
BORIS or CTCF binding we determined the number of peaks that overlapped
with at least 25% of known functional regions in the following order: (i) broad
promoter (±2 kb TSS); (ii) BRD4⁺ H3K27ac⁺ (active) enhancers; (iii) BRD4⁻
H3K27ac⁺ enhancers; (iv) exons; (v) introns; (vi) repressed chromatin represented
by H3K27me3 broad peaks; or (vii) other (if the peak was outside the aforemen- 
tioned regions) (Extended Data Fig. 6d). Enrichment of ChIP-seq binding at resist- 
ant cell BORIS peaks was performed by extending BORIS summits by 1 kb in both 
directions and calculating the normalized read densities in 50-bp bins (Fig. 2d). 

Genomic enrichment of regulatory regions. To visualize the enrichment of CTCF
and BORIS at regulatory regions (enhancers and promoters) and the differences 
between sensitive and resistant cells, a metagene analysis for CTCF and BORIS 
occupancies was performed for all H3K27ac enhancer regions and gene promoters. 
All TSSs were extended in both directions by 2 kb and binned in 50-bp bins. Each
enhancer (start-end) was divided into 40 equally sized bins and extended
by 2 kb in both directions, and these extended regions were binned in 50-bp bins.
Normalized bedgraph files were used to calculate read density (RPM per bp). An 
aggregated summary profile was created for each cell type. To account for different 
numbers of identified enhancers in both cell types we calculated a normalization
factor (N resistant enhancers/N sensitive enhancers) to divide each aggregated read 
density (Extended Data Fig. 6e). 

HiChIP processing and quality control. For all SMC1A-based HiChIP datasets, 
raw reads were first trimmed to a uniform length of 50 bp using trimmomatic
(v.0.36) and were then processed using the HiC-Pro (v.2.10.0) pipeline*® with
default settings for the human genome (build hg19, GRCh37.75) and corresponding
MboI cut sites. To perform intra- and inter-correlation analysis for biological
replicates, forward and reverse reads from the HiC-Pro output were merged
together to generate one-dimensional SMC1A BAM profiles. Genome-wide 
Spearman correlation in 5-kb bins was computed for all merged genomic anchor 
regions on those merged BAMs for all replicates using the ‘multiBamSummary
BED-file’ command from deepTools (Extended Data Fig. 7a, e).

HiChIP loop calling and differential looping analysis. Loops were directly called 
from the HiC-Pro output using hichipper™ (v.0.7.3), with parameter ‘peaks = combined, all’,
and subsequently diffloop”’ (v.1.10.0) with default settings. Only loops
that were detected in all three biological replicates of a sample (sensitive, resist- 
ant, shCtrl or shBORIS) with a minimum of five paired-end tags in total and an 
FDR < 0.01 were retained for further analysis. To call differential loops between 
samples, the quickAssocVoom function was used and significantly different loops 
were either considered reinforced (mango.FDR < 0.01 and log-transformed fold 
change > 1) or lost (mango.FDR < 0.01 and log-transformed fold change < —1). 
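
The retention and differential criteria above can be written as two small predicates. This is a sketch under the stated thresholds only: the `Loop` record, its field names and the "unchanged" label are hypothetical and do not reflect the data structures used by hichipper or diffloop.

```python
from dataclasses import dataclass


@dataclass
class Loop:
    replicates_detected: int  # number of biological replicates with the loop
    paired_end_tags: int      # total supporting paired-end tags
    fdr: float                # loop-calling FDR


def keep_loop(loop: Loop, n_replicates: int = 3,
              min_pets: int = 5, max_fdr: float = 0.01) -> bool:
    """Retain loops seen in all replicates with enough support and low FDR."""
    return (loop.replicates_detected == n_replicates
            and loop.paired_end_tags >= min_pets
            and loop.fdr < max_fdr)


def classify_differential(mango_fdr: float, log_fold_change: float,
                          max_fdr: float = 0.01, min_abs_lfc: float = 1.0) -> str:
    """Label a loop as reinforced, lost or (hypothetical label) unchanged."""
    if mango_fdr < max_fdr and log_fold_change > min_abs_lfc:
        return "reinforced"
    if mango_fdr < max_fdr and log_fold_change < -min_abs_lfc:
        return "lost"
    return "unchanged"


if __name__ == "__main__":
    print(keep_loop(Loop(3, 7, 0.001)), classify_differential(0.005, -1.8))
```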

Classification of HiChIP interactions. SMC1A-based HiChIP interactions (loops)
were classified as previously described** with minor adaptations. Associated
anchors of loops were overlapped with our ChIP-seq peaks (CTCF, BORIS,
H3K27ac, BRD4) and promoter regions (TSS ± 2 kb), requiring a minimum
1-bp overlap. Each anchor was then independently classified according to its 
overlap profile, following a hierarchical tree. If an anchor overlapped a pro- 
moter, an enhancer (BRD4 + H3K27ac), or a CTCF peak, it was classified as 
promoter-, enhancer- or CTCF-anchor, in that order. If there was no overlap, the
anchor was considered ‘other’. By combining these four anchor classes we discriminated
10 different interaction classes. We excluded from further analyses any
interaction that contained an anchor classified as other, which also represented on 
average much shorter interactions (data not shown), and which were hence more 
likely to have occurred due to linear proximity on the DNA. This resulted in the 
identification of 6 main interaction classes (Fig. 3a and Extended Data Fig. 7b). 
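
The hierarchical anchor assignment and the resulting class counts can be illustrated with a short sketch. It assumes an interaction class is an unordered pair of anchor classes, which reproduces the 10 and 6 classes quoted above. The function names and the set-of-labels input are illustrative.

```python
from itertools import combinations_with_replacement

# Priority order used when an anchor overlaps several feature types.
HIERARCHY = ("promoter", "enhancer", "CTCF")


def classify_anchor(overlapping_features: set) -> str:
    """Return the highest-priority feature an anchor overlaps, else 'other'."""
    for label in HIERARCHY:
        if label in overlapping_features:
            return label
    return "other"


def interaction_classes(anchor_classes: tuple) -> list:
    """Enumerate unordered pairs of anchor classes."""
    return list(combinations_with_replacement(anchor_classes, 2))


ALL_CLASSES = interaction_classes(HIERARCHY + ("other",))
RETAINED_CLASSES = [pair for pair in ALL_CLASSES if "other" not in pair]

if __name__ == "__main__":
    print(classify_anchor({"CTCF", "enhancer"}))
    print(len(ALL_CLASSES), len(RETAINED_CLASSES))
```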

Association of BORIS with lost loops. Only loops that were detected in both the
original (sensitive versus resistant) and BORIS depletion (shBORIS versus shCtrl) 
samples were used for this analysis. First, loops were divided into lost and retained 
loops upon BORIS depletion, and an odds ratio (two-sided Fisher's exact test) was 
calculated for the initial presence of BORIS binding on the anchors of these two 
groups (Fig. 3f). An analogous strategy was followed after first stratifying loops 
according to the different identified loop classes (Extended Data Fig. 7f, g). 
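As an illustration of the odds-ratio calculation, the sketch below implements a two-sided Fisher's exact test for a 2 × 2 table in plain Python. The table layout (rows: lost or retained loops; columns: BORIS-bound or unbound anchors) and the counts in the usage note are assumptions for illustration, and the odds ratio shown is the simple sample estimate (ad/bc), which may differ from the estimator used in the original analysis.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """2x2 table [[a, b], [c, d]] -> (sample odds ratio, two-sided P value)."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is at least as unlikely as the observed one.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    odds_ratio = float("inf") if b * c == 0 else (a * d) / (b * c)
    return odds_ratio, min(p_value, 1.0)
```

With hypothetical counts of 3 BORIS-bound and 1 unbound among lost loops, and 1 bound and 3 unbound among retained loops, `fisher_exact_two_sided(3, 1, 1, 3)` gives an odds ratio of 9 and P = 34/70.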
Identification of super-enhancer regions. Super-enhancers were identified using 
the ROSE algorithm (v.1) (https://bitbucket.org/young_computation/rose). In 
short, H3K27ac enriched regions were identified with MACS2 and termed typical 
enhancers. These regions were stitched together if they were within 12.5 kb of each 
other. Stitched regions were ranked by H3K27ac signal therein and the inclination 
point at which the two classes of enhancers separated was determined by ROSE. 
Stitched enhancers above this threshold were considered super-enhancers and 
the others, typical enhancers. To compare different samples, we used the same 
maximum threshold between the conditions considered (Extended Data Fig. 8a). 
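The stitching step can be sketched as a simple interval merge. This is an illustrative re-implementation, not ROSE code; it assumes regions are (start, end) tuples on a single chromosome and that 'within 12.5 kb' means a gap of at most 12,500 bp. The subsequent ranking and inflection-point cut-off performed by ROSE are not reproduced here.

```python
def stitch_regions(regions, max_gap=12_500):
    """Merge (start, end) intervals whose gap to the previous one is <= max_gap bp."""
    stitched = []
    for start, end in sorted(regions):
        if stitched and start - stitched[-1][1] <= max_gap:
            # Close enough to the previous stitched region: extend it.
            stitched[-1][1] = max(stitched[-1][1], end)
        else:
            stitched.append([start, end])
    return [tuple(region) for region in stitched]
```

For instance, `stitch_regions([(0, 1000), (5000, 6000), (30000, 31000)])` merges the first two regions and leaves the third separate.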
Identification of cell-type-specific super-enhancers. Cell-type-specific and active 
super-enhancers were identified by merging both sensitive- and resistant-cell 
super-enhancers and determining cell-type specificity based on the differential
normalized read density of both H3K27ac and BRD4. In brief, ratios
[log2(resistant/sensitive)] were calculated for H3K27ac and BRD4. A combined
threshold of 2.5 was required to identify a cell-type-specific super-enhancer with 
at least a minimum 0.75 change for each individual mark. Super-enhancers that 
did not meet these criteria were classed as shared (neutral) between cell types 
(Extended Data Fig. 8b). 
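The thresholding rule can be written out as a small function. This is a sketch under stated assumptions: the 'combined threshold' is taken to be the sum of the two log2 ratios, thresholds are treated as inclusive, the symmetric rule is applied for sensitive-specific super-enhancers, and the input densities are assumed to be positive (for example after adding a pseudocount). The function and label names are illustrative.

```python
from math import log2


def classify_super_enhancer(h3k27ac_sens, h3k27ac_res, brd4_sens, brd4_res,
                            combined_cutoff=2.5, per_mark_cutoff=0.75):
    """Return 'resistant-specific', 'sensitive-specific' or 'shared'."""
    ratio_ac = log2(h3k27ac_res / h3k27ac_sens)
    ratio_brd4 = log2(brd4_res / brd4_sens)
    combined = ratio_ac + ratio_brd4
    if min(ratio_ac, ratio_brd4) >= per_mark_cutoff and combined >= combined_cutoff:
        return "resistant-specific"
    if max(ratio_ac, ratio_brd4) <= -per_mark_cutoff and combined <= -combined_cutoff:
        return "sensitive-specific"
    return "shared"
```

A region with a fourfold gain in both marks (log2 ratios of 2 each, summed 4) is called resistant-specific, whereas a large gain in one mark alone is not sufficient.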

Correlation analysis of looping with gene expression and enhancer landscape. 
Regulatory interactions were associated to target genes and super-enhancers based 
on proximity to the TSS and minimal overlap (1 bp) with its anchors, respectively 
(Fig. 4a and Extended Data Fig. 8f). 

Chromatin-based gene classification. Genes were classified as having an ‘open’,
‘neutral’ or ‘closed’ chromatin state based on unsupervised clustering of a metagene
representation of ChIP-seq occupancy of H3K27ac and H3K27me3. Each gene 
(from TSS to TES, and 2 kb up- and downstream of this region) was divided 
into 20 equally sized bins; the extended regions were binned in regions of 50 bp. 
Normalized bedgraph files were used to calculate read density (RPM per bp) and 
k-means clustering was applied to group each extended gene region in one of three 
clusters (Extended Data Fig. 8d, e). An aggregated summary profile was created 
for each group of genes. The open and closed clusters were classified based on pre- 
dominantly H3K27ac and H3K27me3 accumulation, respectively, and the ‘neutral’ 
cluster displayed on average equal levels of both. 
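The metagene binning described above can be sketched as follows. The function is illustrative and assumes a plus-strand gene with integer coordinates; it returns 40 upstream bins of 50 bp, 20 equally sized gene-body bins and 40 downstream bins of 50 bp. The read-density calculation and the k-means step are not shown.

```python
def metagene_bins(tss, tes, flank=2_000, body_bins=20, flank_bin_size=50):
    """Return (start, end) bins: upstream flank, gene body, downstream flank."""
    bins = []
    # Upstream flank in fixed-width bins.
    for start in range(tss - flank, tss, flank_bin_size):
        bins.append((start, start + flank_bin_size))
    # Gene body (TSS to TES) split into equally sized bins.
    length = tes - tss
    edges = [tss + round(i * length / body_bins) for i in range(body_bins + 1)]
    bins.extend(zip(edges[:-1], edges[1:]))
    # Downstream flank in fixed-width bins.
    for start in range(tes, tes + flank, flank_bin_size):
        bins.append((start, start + flank_bin_size))
    return bins
```

Each extended gene region is therefore represented by a vector of 100 values per mark, which makes genes of different lengths comparable for clustering.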

Integrated genomic data analysis. An ensemble analysis was performed to iden- 
tify the set of genes that showed characteristics of reactivation in resistant cells. 
For each gene, five features were examined: (i) creation of a unique regulatory 
interaction; (ii) deposition of BORIS on its promoter or looped enhancer; (iii) 
association with a resistant cell-specific super-enhancer through overlap with 
either its promoter or looped anchor; (iv) increased mRNA expression; and (v) 
transition from a closed or neutral state to an open chromatin state. A unique set 
of 89 genes (Supplementary Table) that exhibited four out of five features were 
identified as the top reactivated genes in resistant cells. Within these 89 genes, 13 
were identified as transcription factors by the TcoF database
(http://www.cbrc.kaust.edu.sa/tcof/) (Fig. 4b).
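The feature-counting step of the ensemble analysis can be illustrated with a minimal sketch. The feature keys and gene names are hypothetical, and 'four out of five features' is interpreted here as at least four, which is an assumption.

```python
FEATURES = (
    "new_regulatory_interaction",   # (i)
    "boris_deposition",             # (ii)
    "resistant_super_enhancer",     # (iii)
    "increased_expression",         # (iv)
    "chromatin_opening",            # (v)
)


def reactivated_genes(feature_table, min_features=4):
    """feature_table: {gene: {feature: bool}} -> sorted genes meeting the cut-off."""
    return sorted(
        gene
        for gene, flags in feature_table.items()
        if sum(bool(flags.get(name, False)) for name in FEATURES) >= min_features
    )
```

Genes with a missing feature entry are treated as not showing that feature.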

Allen Brain atlas gene signature. Expression data and metadata for human brain 
development were downloaded from the Allen Brain atlas (http://www.brainspan.org).
Row-normalized z-scores of [log2(RPKM + 1)] values were used to create
a heat map. Values greater than 3.5 were set to 3.5 to reduce the effect of extreme 
outliers on the visualization. Samples were ordered according to developmental 
time points (Extended Data Fig. 8g). 
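The row normalization and capping can be sketched as below. The use of the sample standard deviation is an assumption, and only the upper cap at 3.5 is applied, as described in the text.

```python
from math import log2
from statistics import mean, stdev


def row_zscores(rpkm_row, cap=3.5):
    """Row-normalized z-scores of log2(RPKM + 1), with values above cap set to cap."""
    values = [log2(x + 1) for x in rpkm_row]
    mu, sd = mean(values), stdev(values)
    if sd == 0:
        return [0.0] * len(values)   # constant row: no variation to display
    return [min((v - mu) / sd, cap) for v in values]
```

Capping keeps a single extreme sample from compressing the colour scale for every other sample in the heat map.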

BORIS and BRD4 correlation at promoter regions. BORIS and BRD4 colocali- 
zation and correlation were assessed for the promoter regions of the 89 top-ranked 
genes. The TSS was extended in both directions by 2 kb and binned in 100-bp 
regions. Normalized read densities for BORIS and BRD4 were calculated and a 
Spearman's rank correlation coefficient was calculated for sensitive and resistant cells.
An aggregated density plot of all 89 genes was created to visualize the increased 
deposition and correlation of BRD4 and BORIS in resistant cells (Extended Data 
Fig. 9a). 
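For reference, Spearman's rank correlation between two binned density profiles can be computed as in the sketch below, which assigns average ranks to ties and then takes the Pearson correlation of the ranks. The function names are illustrative; the inputs would be the per-bin BORIS and BRD4 densities (40 bins of 100 bp for TSS ± 2 kb), and both profiles are assumed to be non-constant.

```python
from math import sqrt
from statistics import mean


def ranks(values):
    """1-based ranks, with tied values receiving their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed inputs."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

Because only ranks are used, any monotonic relationship between the two profiles yields a coefficient of 1, regardless of scale.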

Gene expression and DNA-binding analysis. To examine the association between
gene expression and overlapping targets of MYCN and BORIS in sensitive and
resistant cells, respectively, the percentage of gene promoters (TSS ± 2 kb) that
overlapped with ChIP-seq peaks in 10 equally sized bins based on the expression 
distribution was calculated (Extended Data Fig. 6f). To visualize and correlate 
gene expression with DNA binding of MYCN or BORIS, genes were ranked based 
on expression and plotted against the total rescaled (0-100) binding intensities 
calculated for each gene promoter (TSS ± 2 kb). For each ChIP-seq mark a loess
regression curve was computed using a span of 0.1 (Extended Data Fig. 6g). 
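The binned promoter-occupancy calculation can be sketched as follows. The input format, a list of (expression, is_bound) pairs per gene, is an assumption for illustration; genes are ranked by expression and split into equally sized bins, and the percentage of bound promoters is reported per bin. The loess fit is not reproduced here.

```python
def percent_bound_per_bin(genes, n_bins=10):
    """genes: list of (expression, is_bound). Returns % bound per bin, lowest expression first."""
    ranked = sorted(genes, key=lambda gene: gene[0])
    n = len(ranked)
    percentages = []
    for i in range(n_bins):
        chunk = ranked[i * n // n_bins:(i + 1) * n // n_bins]
        if not chunk:
            percentages.append(float("nan"))   # fewer genes than bins
            continue
        bound = sum(1 for _, is_bound in chunk if is_bound)
        percentages.append(100.0 * bound / len(chunk))
    return percentages
```

With 20 hypothetical genes of which the 10 most highly expressed are bound, the five lowest bins return 0% and the five highest return 100%.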
Transcription factor motif enrichment analysis. Statistically overrepresented 
motifs were identified with HOMER (v.2) using the command findMotifs.pl
providing both target and background fasta sequences for regions of interest. For 
promoter regions we selected the top 1,000 up- and downregulated genes in resist- 
ant cells and extended the TSS of each gene by 2 kb in both directions. The genomic 
coordinates were used to extract fasta sequences with the Biostrings package 
(v.2.50.1) in R and used as target or background to identify motifs associated with 
promoter regions of genes within each cell type. A similar strategy was followed to 
identify overrepresented motifs associated with cell-type-specific super-enhancers. 
Target and background fasta sequences were extracted from the summits of BRD4 
peaks located on cell-type-specific super-enhancers and extended by 500 bp in 
both directions. For a selection of enriched sequences, the associated transcription 
factor motif and significance level (P) was visualized using a heat map (Fig. 4d). 
scRNA-seq analysis. The Cell Ranger Single Cell Software Suite, v.1.3 was used to 
perform sample de-multiplexing, barcode and unique molecular identifier (UMI) 
processing, and single-cell 3’ gene counting. A detailed description of the pipeline 
and specific instructions to run it can be found at:
https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/what-is-cell-ranger.
A high-quality gene expression matrix was created in sequential preprocessing steps.
First, UMI-based counts were converted to relative expression concentrations by
rescaling each cell to a library size of 10,000. Genes were considered detected if 
rescaled count > log(0.1 + 1) and retained for further analysis if present in at least 
0.5% of the cells from the sample with the lowest cell count. Cells were removed if 
fewer than 1,000 genes were detected. To remove low-quality cells, we calculated 
five technical indicators (ratio of detected genes/UMI, percentage of mitochondrial 
genes, percentage of ribosomal genes, average GC content of library and library 
complexity measured by Shannon Entropy) and performed PCA on indicators 
with a coefficient of variation > 5%. Next, density-based clustering was performed 
on the first and second principal component using an epsilon determined by a 
k-nearest neighbour plot. All cells that were located outside the main cluster were 
considered low quality and removed from further analysis. Next, we used the R 
package ‘scater’ (v.1.10.0) to confirm that there were no technical or experimental 
confounding effects and the R package ‘Seurat’ (v.2.3.4) to analyse and visualize the 
data. In brief, UMI counts were log-normalized with a scale factor of 10,000 and 
subsequently centre-scaled. To visualize cells in a reduced dimensionality, PCA 
was performed on the most variable genes, which were identified as genes with 
higher-than-expected variability in consecutive ranked expression bins. Higher 
complexity clustering was performed with t-SNE using the first 10 principal
components, which were deemed most informative based on heat map and elbow plot
observation. To identify homogeneous subpopulations, we performed iterative 
clustering using the network-based clustering algorithm (shared nearest neigh- 
bour) with different resolutions as input until each sample was at least separated 
in two groups. A simple pseudotime analysis was performed by calculating an 
average expression profile for each identified subpopulation and ordering them 
according to the summarized expression of transcription factors that displayed var- 
iable expression between sensitive and intermediate or intermediate and resistant 
cells. Variable expression was defined as showing at least a 33% change in the rank 
of expression between two samples with a minimal normalized expression level 
> 0.2. For each sample comparison, at least the top 10 most variable transcrip- 
tion factors were included. In total this resulted in 32 transcription factors. Gene 
expression values were then linearly rescaled between 0 and 10 to jointly visualize 
relative expression changes during this pseudotime. To examine co-detection or 
mutual exclusivity between genes of interest, a two-sided Fisher’s exact test was 
performed for all cells in a given sample. A score combining both the odds ratio 
and the −log10(P value) was calculated to visualize both the strength and direction
of association between genes in pairwise co-expression tests.
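The first preprocessing steps of the scRNA-seq analysis (library-size rescaling, gene detection and the per-cell gene filter) can be sketched as below. This is an illustration rather than the pipeline used: cells are represented as dictionaries of UMI counts, the logarithm in the detection threshold is assumed to be the natural logarithm, and the function names are hypothetical.

```python
from math import log


def rescale_cell(umi_counts, library_size=10_000):
    """Rescale one cell's UMI counts so that they sum to library_size."""
    total = sum(umi_counts.values())
    if total == 0:
        return {gene: 0.0 for gene in umi_counts}
    return {gene: count * library_size / total for gene, count in umi_counts.items()}


def detected_genes(rescaled, threshold=log(0.1 + 1)):
    """Genes whose rescaled count exceeds log(0.1 + 1)."""
    return {gene for gene, value in rescaled.items() if value > threshold}


def keep_cell(umi_counts, min_genes=1_000):
    """Cells with fewer than min_genes detected genes are removed."""
    return len(detected_genes(rescale_cell(umi_counts))) >= min_genes
```

For a toy cell `{"A": 1, "B": 3, "C": 0}`, rescaling gives 2,500 and 7,500 for A and B, so two genes are detected; the cell would pass a filter of `min_genes=2` but not the 1,000-gene filter used in the text.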

Statistical analysis. Analysis for each plot is listed in the figure legend and/or in 
the corresponding Methods. In brief, all grouped data are presented as mean ± s.d.
unless stated otherwise. All box and whisker plots of expression data are presented 
as: centre lines, medians; box limits, twenty-fifth and seventy-fifth percentiles; 
whiskers, minima and maxima (1.5 × the interquartile range). Statistical
significance for pairwise comparisons was determined using the two-sided Wilcoxon
rank-sum test or two-sided unpaired t-test, unless stated otherwise. Survival analy- 
sis was performed using the Kaplan-Meier method and differences between groups 
calculated by the two-sided log-rank test and the Bonferroni correction method. 
Tumour volume comparisons for the xenograft studies were analysed by the
Mann-Whitney U test. *P < 0.05; **P < 0.01. Statistical comparisons of distributions of
fold changes for the expression microarrays were done using the Mann-Whitney
U test. All quantitative analyses are expressed as the mean ± s.d. of three biological
replicates, unless stated otherwise. Microarray and ChIP-seq data are based on at 
least two independent experiments. For all experiments, no statistical methods 
were used to predetermine sample size. Unless stated otherwise, experiments were 
not randomized and investigators were not blinded to allocation during experi- 
ments and outcome assessment. 

Track visualizations. Peaks, (super-) enhancers and HiChIP interactions were 
visualized with a custom-built tool (github.com/RubD/GeTrackViz2) or with the
circlize package (v.0.4.5) in R. 

Retrospective analysis of gene expression in human samples. Gene expression 
levels or correlations across primary tumours, healthy tissues or experimental data 
and patient survival were determined through analysis of the TCGA and TARGET 
(https://cancergenome.nih.gov/), GTEx (https://www.gtexportal.org/home/), R2 
(https://hgserver1.amc.nl/cgi-bin/r2/main.cgi), Allen Brain atlas
(http://www.brain-map.org/) and selected datasets representing distinct tumour types with poor
prognosis feature annotations (GSE49710 (Neuroblastoma), GSE17679 (Mixed
Ewing Sarcoma), GSE63074 (Non-small cell lung carcinoma), GSE15709
(ovarian cancer), GSE16179 (breast cancer) and GSE7181 (Glioblastoma)).
Reporting summary. Further information on research design is available in 
the Nature Research Reporting Summary linked to this paper. 
Extended Data Fig. 1 | BORIS is expressed in several cancers and
associated with high-risk features. a, b, Relative mRNA expression 
[log2(FPKM + 1)] of CTCF and BORIS in normal tissues (a) and in 
various cancer types based on TCGA datasets (b). FPKM, fragments per 
kilobase of transcript per million mapped reads. Keys to cancer types: 
ACC, adrenocortical carcinoma; AML, acute myeloid leukaemia; BLCA, 
bladder urothelial carcinoma; BRCA, breast invasive carcinoma; CESC, 
cervical squamous cell carcinoma and endocervical adenocarcinoma; 
CHOL, cholangiocarcinoma; COAD, colon adenocarcinoma; DLBC, 
diffuse large B-cell lymphoma; ESCA, oesophageal carcinoma; GBM, 
glioblastoma multiforme; HNSC, head and neck squamous cell 
carcinoma; LGG, low-grade glioma; KICH, kidney chromophobe; 
KIRC, renal clear cell carcinoma; KIRP, kidney renal papillary cell 
carcinoma; LAML, acute myeloid leukaemia; LIHC, hepatocellular 
carcinoma; LUAD, lung adenocarcinoma; LUSC, lung squamous cell 
carcinoma; MESO, mesothelioma; NB, neuroblastoma; OV, serous 
ovarian cystadenocarcinoma; PAAD, pancreatic adenocarcinoma; PCPG,
pheochromocytoma and paraganglioma; PRAD, prostate adenocarcinoma;
READ, rectum adenocarcinoma; RT, rhabdoid tumour; SARC, sarcoma; 
SKCM, skin cutaneous melanoma; STAD, stomach adenocarcinoma; 
TGCT, testicular germ cell tumour; THCA, thyroid carcinoma; THYM, 
thymoma; UCEC, uterine corpus endometrial carcinoma; UCS, uterine 
carcinosarcoma; UVM, uveal melanoma; WT, Wilms tumour. c, Box plots 
showing the correlation of BORIS expression with risk status, tumour 
stage (primary versus metastasis/recurrence), presence of cancer stem 
cells (CD133 positivity) and response to targeted (lapatinib) or cytotoxic 
(cisplatin) therapy in the tumour types depicted. NSCLC, non-small cell 
lung cancer. Datasets (Mixed Ewing Sarcoma-Savola-117 and NSCLC- 
Plamadeala-410) were extracted from the R2: Genomics Analysis and 
Visualization Platform (http://r2.amc.nl). GSE7181 (glioblastoma); 
GSE16179 (breast cancer); GSE15372 (ovarian cancer). P values 
determined by two-sided Wilcoxon rank-sum test. For all panels, sample 
sizes (n) are depicted in parentheses and box plots are as defined in Fig. 4.
Extended Data Fig. 2 | ALK inhibitor-resistant cells exhibit stable
resistance in vivo and no longer rely on ALK signalling. a, Left, tumour
volumes of sensitive and resistant cell xenografts in untreated NU/NU
(Crl:NU-Foxn1nu) mice established by subcutaneous injection of

2 x 10° cells into both flanks. Animals were euthanized when tumours 
reached 1,500-2,000 mm³. Data are mean ± s.e.m., n = 4 per arm. Right,
immunoblot analysis of total and phosphorylated ALK in TAE-resistant 
xenograft tumours (1 and 2) and sensitive and resistant cells in culture. 

b, Dose-response curves for TAE684 in sensitive and resistant cell lines 
established from the same tumour xenografts as in a (IC50 values: sensitive,
7.9 nM; resistant, 878.6 nM). Data are mean ± s.d., n = 3 biological
replicates. c, Tumour volumes (left) and Kaplan-Meier survival curves
(right) of resistant cell xenografts in NU/NU (CrTac:NCr-Foxn1nu) mice
treated with TAE684 (10 mg kg−1 by oral gavage once daily) or vehicle
control for up to 56 days. Data are mean ± s.e.m., n = 8 per arm. P values
determined by Mann-Whitney U test for tumour volumes (P = 0.8404) 
and by log-rank test for Kaplan-Meier survival analysis (P = 0.8076), both 
two-sided. d, Dose-response curves for TAE684-sensitive and -resistant
cells treated with ceritinib (IC50 values: sensitive, 33.8 nM; resistant,
446.5 nM) or lorlatinib (IC50 values: sensitive, 47.5 nM; resistant, 2,318
nM). Data are mean ± s.d., n = 3 biological replicates. e, Immunoblot
analysis of the indicated proteins in sensitive and resistant cells treated
with DMSO or 1 μM TAE684 for 6 or 24 h. f, Electropherograms of ALK
kinase domain sequencing in sensitive and resistant cells. Arrows show the 
F1174L mutation characteristic of Kelly cells. HEK293T cells were used 
as a control for sequencing wild-type ALK. g, Phosphoproteomic analysis 
of a panel of receptor tyrosine kinases (RTKs) in sensitive and resistant 
cells. Each RTK is shown in duplicate and the pairs in the corners of each 
array are positive controls. Numbered RTKs with corresponding names 
listed on the right represent the highest-phosphorylated proteins. ALK
is depicted in red. h, Quantitative reverse transcription PCR (qRT-PCR)
and immunoblot analysis of ABCB1 and ABCG2 multidrug transporter 
expression in sensitive and resistant cells. The qRT-PCR data are means 
of n = 2 biological replicates. In a (immunoblot), d, f and g, data are 
representative of two independent experiments (see Supplementary
Note 1 for details; for gel source data, see Supplementary Fig. 1).
Extended Data Fig. 3 | Development of resistance is associated with loss
of MYCN followed by gradual induction of proneural transcription
factors. a, TAE684 dose-response curves of Kelly neuroblastoma
cells during resistance establishment (IC50 values: sensitive, 39.4 nM;
intermediate, 618 nM; resistant, 1,739 nM). Data are mean ± s.d.,
n = 3 biological replicates. Schematic representation of development of
resistance is shown above. b, t-SNE plot of scRNA-seq data showing the 
segregation of sensitive (n = 5,432), intermediate (n = 6,376) and resistant 
(n = 6,379) cells. c, t-SNE plot depicting unsupervised clusters for the 
individual subpopulations that underlie the pseudotime analysis.
d, Heat map of rescaled gene expression values of the most variable ranked
transcription factors in the three cell states. e, RT-PCR and immunoblot
analysis of MYCN expression in TAE684-resistant xenograft tumours
(1 and 2) and sensitive and resistant cells in culture. The RT-PCR data
are mean ± s.d., n = 4 biological replicates for sensitive and resistant cells
(***P = 1.396 x 107"; unpaired two-sided t-test) and n = 3 technical
replicates for each tumour. f, Fluorescence in situ hybridization of MYCN 
in sensitive and resistant cells (representative of 20 nuclei per condition). 
g, ChIP-seq track of H3K27me3 binding at the MYCN locus in sensitive 
and resistant cells. Signal intensity is given in the top right corner. h, Line 
plot showing the association between genes ordered by expression (x axis) 
and changes in absolute gene expression levels (y axis) in sensitive versus 
resistant cells. Bar plot, total transcriptional yield in sensitive or resistant 
cells. i, Immunoblot analysis of the indicated proteins in sensitive and 
resistant cells expressing control (shCtrl) or MYCN (shMYCN-1 and -2) 
shRNAs. j, Violin plots representing the expression distribution of selected 
genes in the same cells as in a (centre line, median). k, Bar plot showing 
the fractions of cells with detectable mRNA levels of the same genes as in 
d. In e (immunoblot) and f-i, data are representative of two independent 
experiments (for gel source data, see Supplementary Fig. 1). 
Extended Data Fig. 4 | Overexpression of BORIS is seen in resistance
models of neuroblastoma and correlates with high-risk disease and
a poor outcome. a, RT-PCR and immunoblot analysis of BORIS
expression in TAE684-resistant Kelly cell xenograft tumours (1 and 2) and
sensitive and resistant cells in culture. The qRT-PCR data are mean ± s.d.,
n = 4 biological replicates for sensitive and resistant cells (**P = 0.0014; 
unpaired two-sided t-test) and n = 3 technical replicates for each tumour. 
b, Bisulfite sequencing of the BORIS promoter in sensitive and resistant 
cells. Black circles represent methylated cytosine residues in a CpG 
dinucleotide, empty circles are unmethylated cytosines. The B and C TSSs 
are indicated by arrows. c, Dose-response curves to TAE684 (left) and 
immunoblot analysis of BORIS expression (right) in TAE684-sensitive and 
-resistant SK-N-SH neuroblastoma cells (IC50 values: sensitive, 47.9 nM;
resistant, 1,739 nM). d, Dose-response curves to the CDK12 inhibitor, E9 
(left) and immunoblot analysis of BORIS expression (right) in sensitive 


and resistant SK-N-BE(2) neuroblastoma cells (IC50 values: sensitive,
9.5 nM; resistant, 638 nM). Data are mean ± s.d., n = 3 biological
replicates for c (left) and d (left). e, Immunohistochemical staining
of BORIS expression in primary neuroblastoma tumour samples
(representative of four independent experiments). Scale bar, 20 μm.
f, Box plots showing correlation of BORIS expression with the indicated
parameters in a human neuroblastoma dataset (n = 498; Tumour 
Neuroblastoma-SEQC-498; R2: Genomics Analysis and Visualization 
Platform (http://r2.amc.nl)). Box plots are as defined in Fig. 4. P values 
were determined by two-sided Wilcoxon rank-sum test. g, Kaplan-Meier 
analysis of overall survival based on BORIS expression in the same dataset 
as in f (n = 498; two-sided log-rank test with Bonferroni correction). In 
a, c, d (immunoblots) and b, data are representative of two independent 
experiments. Sample sizes (n) are depicted in parentheses for f and g
(for gel source data, see Supplementary Fig. 1).
Extended Data Fig. 5 | Resistant cells are dependent on BORIS for
survival. a, Dose-response curves to TAE684 in resistant cells expressing
control (shCtrl) or BORIS (shBORIS) shRNAs (IC50 values: shCtrl,
537.7 nM; shBORIS, 141.2 nM). Data are mean ± s.d., n = 3 biological
replicates. b, Heat map of gene expression values in the same cells as in a 
(n = 2 biological replicates). Rows are z-scores calculated for each gene 
in both conditions. c, Immunoblot analysis of the indicated proteins in 
the same cells as in a. d, e, Immunoblot analysis of the indicated proteins 
(CL., cleaved; CC3, cleaved caspase 3) (d), and quantification of trypan 
blue staining (e) in sensitive and resistant cells expressing control (shCtrl) 
or BORIS (shBORIS-3 and -4) shRNAs. Data are mean ± s.d., n = 3
biological replicates (*P < 0.05; **P < 0.01; ***P < 0.001; unpaired
two-sided t-tests). f-h, Phase-contrast microscopy images (scale bars,
150 μm) (f), growth curves (g) and flow cytometry analyses (h) of
propidium iodide (PI) staining in sensitive, intermediate and resistant
cells. Data are mean ± s.d., n = 3 biological replicates (***P < 0.0001 for
all comparisons; two-way ANOVA). i, RT-PCR analysis of the expression 
of the indicated proneural transcription factors in the same sensitive 
(DMSO) versus MYCN®? and BORIS™ (DOX + TAE) cells as in Fig. 1g. 
Data are mean ± s.d., n = 3 biological replicates (*P < 0.05; **P < 0.01;
unpaired two-sided t-tests). In c, d, f and h, data are representative of two 
independent experiments (for gel source data, see Supplementary Fig. 1). 
Extended Data Fig. 6 | BORIS colocalizes with CTCF and open
chromatin. a, Bar graphs illustrating the overlap of shared and specific 
BORIS and CTCF-binding sites in sensitive and resistant cells. Most 
resistant cell-specific BORIS peaks (red) colocalize with CTCF peaks 
that are shared between the two cell types. The markedly lower number 
of BORIS peaks that are unique to sensitive cells (green) or shared 
between sensitive and resistant cells (grey) typically do not overlap with 
CTCF peaks that are shared or specific to any cell type (top). Most CTCF 
peaks are shared (grey) between sensitive and resistant cells and either 
do not overlap with BORIS peaks, or overlap only with those restricted 
to resistant cells (bottom). b, Comparison of CTCF and BORIS peaks 
identified in sensitive and resistant cells. c, Co-immunoprecipitation of 
BORIS with CTCF in sensitive and resistant cells (representative of two 
independent experiments). IgG and sample without antibody (Ab) serve 
as controls. d, Pie charts depicting the percentages of genomic regions 
bound by BORIS in sensitive (top) and resistant (bottom) cells. Numbers 
of BORIS-binding peaks in each cell type are given below each pie
chart. The regions shown are promoters (TSS ± 2 kb), typical enhancers
(H3K27ac), active enhancers (H3K27ac + BRD4), repressed chromatin
(H3K27me3), exons, introns, and other (peaks not assigned to any of
the previous categories). e, Meta-analysis of average CTCF and BORIS
ChIP-seq signals in RPM per bp at enhancer and TSS regions in sensitive 
and resistant cells. f, Percentage of gene promoters bound by BORIS in 
sensitive (black) and resistant (red) cells for 10 equal-sized groups ordered 
based on absolute gene expression levels in resistant cells. Percentage of 
promoters bound by BORIS in resistant cells that were also originally 
bound by MYCN in sensitive cells is shown in grey. g, Loess regression 
analysis of ranked gene expression against BORIS and MYCN occupancies 
at gene promoters in sensitive and resistant cells. Shaded regions represent 
95% confidence intervals. All panels except c depict data from n = 2 
biological replicates (for gel source data, see Supplementary Fig. 1). 
Extended Data Fig. 7 | Regulatory loops in resistant cells are more 
vulnerable to BORIS depletion. a, Heat map depicting the Spearman 
correlation between HiChIP biological replicates of sensitive and resistant 
cells in genome-wide bins of 5 kb for all merged anchor regions. b, Box 
plots showing the genomic length distribution (in log,(bp)) for interaction 
classes that are specific to resistant cells. c, Table depicting HiChIP loop 
class statistics in resistant cells, including their association with BORIS 
binding. d, ChIP-seq tracks of the indicated proteins in sensitive and 
resistant cells at the TCP11L2 locus (representative of two independent 
experiments), with resistant cell-specific regulatory interactions shown 
below (HiChIP Res: PET numbers, next to each interaction). Signal 
intensity is given in the top left corner for each track. e, Heat map 
depicting the Spearman correlation between HiChIP biological replicates 
of sensitive, resistant, shCtrl and shBORIS cells in genome-wide bins of
5 kb for all merged anchor regions. f, Bar plots showing the number and
fraction of resistant cell-specific loops for all interaction classes that
were BORIS negative and positive in resistant cells, and that were lost 
after BORIS depletion. g, Bar plots showing the odds ratio (two-sided 
Fisher’s exact test) of losing loops that were previously bound by BORIS 
for all interaction classes. h, Box plots showing the initial intensities
(in normalized read counts) of BORIS and SMC1A binding in the shRNA
control cells at the anchors of the resistant cell-specific loops that were 
significantly lost versus those that were retained in shBORIS cells 
(two-sided Wilcoxon rank-sum test). i, Box plot showing the difference 
in SMC1A loss (shBORIS versus shCtrl) on the same anchors as in h. P 
value determined by two-sided Wilcoxon rank-sum test. All box plots 
are as defined in Fig. 4. j, Metaplots depicting BORIS, SMC1A and CTCF 
binding at the anchors of the resistant cell-specific loops that were lost
or retained after BORIS depletion. In a-c and e-g, data are from n = 3
biological replicates. In h-j, data are from n = 2 biological replicates. 


Extended Data Fig. 8 | Redistribution of the super-enhancer landscape 
with subsequent expression of a BORIS-dependent proneural network 
in resistant cells. a, Accumulation of H3K27ac signal at enhancer 
regions. Typical enhancers (grey) are plotted according to increasing 
levels of normalized H3K27ac signal (length x density) in sensitive and 
resistant cells. The highest cut-off based on the inclination point in both 
sensitive and resistant cells was used to delineate super-enhancers (red). 
b, Scatter plot showing differential binding of H3K27ac [log2(RPM per bp
+ 1)] and BRD4 [log2(RPM per bp + 1)] for all detected super-enhancers
in both sensitive and resistant cells. Cell-specific super-enhancers were
identified based on the combined increase in H3K27ac and BRD4
binding. For each individual histone mark, a 0.75 log2-transformed
fold change threshold was applied and a minimum summed 2.5
log2-transformed fold change was used as the final cut-off. c, Bar plot depicting
the enrichment (two-sided Fisher’s exact test) and fractions of resistant 
cell-specific and shared super-enhancers that were located at resistant 
cell-specific regulatory loop anchors in resistant cells. d, Density plots 
showing the aggregated accumulation of H3K27ac and H3K27me3 at gene 
regions, defined as 2 kb upstream of the TSS and 2 kb downstream of the 
transcription end site (TES). k-means clustering (k = 3) analysis resulted 
in the separation of genes associated with ‘open’, ‘neutral’ or ‘closed’
chromatin in both sensitive and resistant cells. e, Sankey diagram of the 
distribution of genes in distinct chromatin states and the switches between
sensitive and resistant cells. f, Box plots showing the expression level
changes upon BORIS depletion for genes that had a resistant cell-specific
and BORIS-positive regulatory interaction and were not associated with
a super-enhancer (n = 720), associated with a super-enhancer in both
cell types (n = 514) or associated with a super-enhancer seen only in the
resistant cells (n = 134) (two-sided Wilcoxon rank-sum test). Box plots 
are as defined in Fig. 4. g, Heat map of the expression levels of the indicated 
proneural transcription factor genes during brain development 
(http://www.brain-map.org/). Gene expression levels are represented as 
z-scores for different developmental time points (n = 413; pcw, post- 
conceptional weeks). h, Heat map showing the odds ratios (two-sided 
Fisher’s exact test) for co-detection of the indicated transcription factors 
based on the scRNA-seq data in resistant cells (n = 6,379). i, Immunoblot 
analysis of the indicated proteins in sensitive and resistant cells expressing 
control (shCtrl) or BORIS (shBORIS-3 and -4) shRNAs. j, k, RT-PCR 
analysis of the indicated genes (j) and ChIP-qPCR analysis of BORIS 
binding at the promoter regions of BORIS and NEUROG2 (k) in sensitive 
and resistant SK-N-BE(2) neuroblastoma cells. Data are mean ± s.d., 
n = 3 biological replicates in j and k (*P < 0.05; **P < 0.01; ***P < 0.001; 
unpaired two-sided t-tests). All other panels except g and h depict data from 
n = 2 biological replicates (for gel source data, see Supplementary Fig. 1). 
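
The cell-specific super-enhancer cut-off described in panel b can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that each fold change is the resistant-minus-sensitive difference of log2(RPM per bp + 1) values and that the "summed" fold change is the sum of the H3K27ac and BRD4 log2 fold changes. The function names and any input values are hypothetical.

```python
import math

PER_MARK_MIN = 0.75  # minimum log2 fold change required for each mark (from the legend)
SUMMED_MIN = 2.5     # minimum summed log2 fold change (from the legend)


def log2_signal(rpm_per_bp):
    """Return log2(RPM per bp + 1), the transformation named in the legend."""
    return math.log2(rpm_per_bp + 1)


def is_resistant_specific(h3k27ac_sens, h3k27ac_res, brd4_sens, brd4_res):
    """Apply both thresholds to one super-enhancer (assumed interpretation)."""
    fc_h3k27ac = log2_signal(h3k27ac_res) - log2_signal(h3k27ac_sens)
    fc_brd4 = log2_signal(brd4_res) - log2_signal(brd4_sens)
    passes_each = fc_h3k27ac >= PER_MARK_MIN and fc_brd4 >= PER_MARK_MIN
    passes_sum = (fc_h3k27ac + fc_brd4) >= SUMMED_MIN
    return passes_each and passes_sum
```

Under these assumptions, a region must gain signal for both marks; a large gain in one mark cannot compensate for a gain below 0.75 in the other.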
Extended Data Fig. 9 | The proneural transcription factor network in 
resistant cells is sensitive to BRD4 inhibition. a, Metaplots showing the 
correlation between BRD4 and BORIS co-occupancies at the promoter 
regions (± 2 kb) of the 89 top-ranked genes in resistant versus sensitive 
cells based on the features in Fig. 4b (r, Spearman correlation coefficient). 
b, Immunoblot analysis of BRD4 and cleaved PARP expression in sensitive 
and resistant cells expressing control (shCtrl) or BRD4 (shBRD4-A and 
-B) shRNAs. c, Immunoblot analysis of the indicated proteins in sensitive 
and resistant cells treated with DMSO, TAE684 (1 µM) or JQ1 (2.5 µM) 
for 48 h. d, Dose-response curves for sensitive and resistant cells treated 
with JQ1 or I-BET726 (JQ1 (IC50 values: sensitive, 4,798 nM; resistant, 
645 nM); I-BET726 (IC50 values: sensitive, 6,203 nM; resistant, 347 nM)). 
Data are mean ± s.d., n = 3 biological replicates. e, Box plots comparing 
the expression of the transcription factors listed in Fig. 4b (n = 13) with 
that of all genes (n = 18,038) in sensitive versus resistant cells (left), 
and between DMSO and JQ1-treated resistant cells (right) (P values 
determined by two-sided Wilcoxon rank-sum test). f, ChIP-seq tracks of 
the indicated proteins at the SIX1 or SIX4 locus in sensitive, resistant and 
JQ1-treated resistant cells (2.5 µM for 48 h). Super-enhancers are depicted 
as coloured rectangles below the tracks. Signal intensity is shown in the top 
left corner for each track. g, h, Tumour volumes (g) and survival curves 
(h) in sensitive- and resistant-cell xenografts in NU/NU (Crl:NU-Foxn1nu) 
mice treated with JQ1 (50 mg kg−1 i.p. once daily) and vehicle control 
for up to 87 days. Data are mean ± s.e.m., n = 6 per arm. Significance 
was calculated by Mann-Whitney U test for tumour volumes (sensitive: 
P = 0.3231; resistant: P = 0.0023) and by log-rank test for Kaplan-Meier 
survival analysis (sensitive: P = 0.3047; resistant: P = 0.0348), both two-sided. 
i, Heat map of gene expression values in sensitive, resistant and JQ1-treated 
resistant cells. Rows are z-scores calculated for each gene in each 
condition. j, Number of transcripts in sensitive, JQ1-treated resistant, 
shBORIS-expressing resistant and resistant cells based on expression array 
data after spike-in normalization. k, Scatter plot displaying the median- 
scaled fold-change gene expression values for shBORIS and JQ1-treated 
resistant cells. The top-ranked transcription factors that show decreased 
expression levels after both BORIS knockdown and JQ1 treatment are 
listed in red (bottom left quadrant). The pie chart represents the fraction 
of all top-ranked transcription factors that are located in the left lower 
quadrant of the scatter plot. All box plots are as defined in Fig. 4. In b, c 
and f, data are representative of two independent experiments. In a, e and 
i-k, data are from n = 2 biological replicates (see Supplementary Note 2 
for further details; for gel source data, see Supplementary Fig. 1). 
Extended Data Fig. 10 | Aberrantly expressed BORIS binds to 
regulatory regions and is associated with new super-enhancers in Ewing 
sarcoma cells. a, Immunoblot analysis of BORIS expression in TC-32 
(pre-chemotherapy), TC-71 and CHLA-10 (relapsed, post-chemotherapy) 
Ewing sarcoma cells, compared with BORIS expression in resistant (Kelly) 
neuroblastoma cells. b, Meta-analysis of average BORIS ChIP-seq signals 
in RPM per bp at all combined BORIS-binding sites for TC-32 and TC-71 
cells. c, Meta-analysis of average BORIS, H3K27ac and SMC1A ChIP-seq 
signals in RPM per bp at TC-71-specific BORIS-binding sites. d, Pie chart 
depicting the proportions of genomic regions bound by BORIS in TC-71 
cells. The regions shown are promoters (TSS ± 2 kb), typical and super- 
enhancers (H3K27ac), and other (if peaks were not assigned to any of 
the previous categories). e, Bar plot showing the odds ratios (two-sided 
Fisher’s exact test) of BORIS localization to regulatory genomic regions in 
TC-71 cells. All panels are representative of two independent experiments 
(see Supplementary Note 3 for further details; for gel source data, 
see Supplementary Fig. 1). 
Reporting Summary 


Nature Research wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency 
in reporting. For further information on Nature Research policies, see Authors & Referees and the Editorial Policy Checklist. 


Statistics 


For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section. 


n/a | Confirmed 


The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement 


A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly 


The statistical test(s) used AND whether they are one- or two-sided 
Only common tests should be described solely by name; describe more complex techniques in the Methods section. 


A description of all covariates tested 


A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons 


A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) 
AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) 


For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted 
Give P values as exact values whenever suitable. 


For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings 


For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes 


Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated 


Our web collection on statistics for biologists contains articles on many of the points above. 


Software and code 


Policy information about availability of computer code 


Data collection Data collection was done with BD CellQuest Pro software (BD Biosciences) for the flow cytometry experiment. 
The Cell Ranger Single Cell Software Suite (v1.3) was used to perform sample de-multiplexing, barcode and UMI processing, and single-cell 
3' gene counting for the single cell RNA-sequencing experiment. 


Data analysis R (v3.1.3), affy (v1.44.0), limma (v3.22.7), Fastqc (v0.11.5), STAR (v2.5.1b_modified), Samtools (v1.3.1), MarkDuplicates (v2.1.1), 
deepTools (v2.2.4), MACS2 (v2.1.1), bedGraphToBigWig (v4), R (v3.5.1), Rstudio (v1.1.463), data.table (v1.12.2), trimmomatic (v0.36), 
HiC-Pro (v2.10.0), hichipper (v0.7.3), diffloop (v1.10.0), ROSE (v1), HOMER (v2) (http://homer.ucsd.edu/homer/motif/), Biostrings 
(v2.50.1), Cell Ranger (v1.3), Scater (v1.10.0), Seurat (v2.3.4), Circlize (v0.4.5), IGV (v2.3.74), FlowJo (v10.0.5) and GraphPad Prism (v7.02). 
Custom code is available upon reasonable request. 


For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. 
We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information. 


Data 


Policy information about availability of data 
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: 


- Accession codes, unique identifiers, or web links for publicly available datasets 
- A list of figures that have associated raw data 
- A description of any restrictions on data availability 


The microarray, ChIP-seq, HiChIP and scRNA-seq datasets generated and analyzed during the current study are available in the Gene Expression Omnibus repository 
under accession number GSE103084. The authors declare that all other data supporting the findings of this study are available within the paper and its 
Supplementary Information files. 
Field-specific reporting 


Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection. 


[x] Life sciences   [ ] Behavioural & social sciences   [ ] Ecological, evolutionary & environmental sciences 


For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf 


Life sciences study design 


All studies must disclose on these points even when the disclosure is negative. 


Sample size No statistical methods were used to predetermine sample sizes. Sample sizes were chosen in order to be able to perform statistical analyses, 
as is standard in the field. 


Data exclusions No data were excluded from the analyses. 


Replication To verify the reproducibility of our findings, experiments were performed using at least three biological replicates, unless clearly stated 
otherwise in the figure legends. All attempts at replication were successful. 


Randomization For in vivo mice experiments, mice were randomized into groups of equal average volumes. 


Blinding For experiments involving human research participants, investigators were blinded to group allocation during data collection. For in vivo mice 
experiments, investigators were not blinded to group allocation during data collection. For experiments involving cell culture, blinding did not 
apply. 


Reporting for specific materials, systems and methods 


We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, 
system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response. 


Materials & experimental systems Methods 

n/a | Involved in the study n/a | Involved in the study 
Antibodies ChIP-seq 
Eukaryotic cell lines Flow cytometry 
Palaeontology MRI-based neuroimaging 


Animals and other organisms 


Human research participants 


Clinical data 


Antibodies 


Antibodies used The following antibodies were used (all commercially available): 


From Cell Signaling Technology (Danvers, MA, USA): 
1. N-Myc (#9405, lot 2, 1:1,000); 
2. N-Myc (#51705, Clone D4B2Y, lot 1); 
3. Cleaved PARP Asp214 (#9541, lot 15, 1:4,000); 
4. Cleaved caspase 3 Asp175 (#9661, lot 45, 1:500); 
5. ALK (#3333, Clone C26G7, lot 7, 1:1,000); 
6. Akt (pan) (#4691, Clone C67E7, lot 20, 1:4,000); 
7. Phospho-Akt Thr308 (#9275, lot 26, 1:1,000); 
8. Phospho-Akt Ser473 (#9271, lot 14, 1:2,000); 
9. p44/42 MAPK (ERK1/2) (#4695, Clone 137F5, lot 21, 1:4,000); 
10. Phospho-p44/42 MAPK (ERK1/2) (Thr202/Tyr204) (#4377, Clone 197G2, lot 10, 1:4,000); 
11. S6 Ribosomal Protein (#2217, Clone 5G10, lot 5, 1:8,000); 
12. Phospho-S6 Ribosomal Protein (Ser235/236) (#4857, Clone 91B2, lot 2, 1:8,000); 
13. Stat3 (#4904, Clone 79D7, lot 7, 1:2,000); 
14. Phospho-Stat3 (Tyr705) (#9131, lot 30, 1:1,000); 
15. MDR1/ABCB1 (#12683, Clone D3H1Q, lot 2, 1:5,000); 
16. Sox2 (#3579, Clone D6D9, lot 8, 1:2,000); 
17. β-Actin (#4967, multiple lots, 1:5,000); 
18. CTCF (#3417, Clone D1A7, lot 1); 
19. Normal Rabbit IgG (#2729, lot 7) 
Validation 


20. Anti-mouse IgG, HRP-linked (#7076, lot 33, 1:2,000). 


From Santa Cruz Biotechnology (Santa Cruz, CA, USA): 
21. Anti-rabbit IgG-HRP (sc-2357, multiple lots, 1:5,000). 


From Bethyl Laboratories (Montgomery, TX, USA): 
22. BRD4 (A301-985A100, lot 6, 1:30,000); 
23. SMC1A (A300-055A, lot 6). 


From Millipore (Billerica, MA, USA): 

24. CTCF (#07-729, lot 2887267, 1:5,000); 
25. SOX9 (#AB5535, lot 2847051, 1:10,000); 
26. H3K27me3 (#07-449, lot 2972864). 


From Abcam (Cambridge, MA, USA): 

27. ALK (phospho Y1507) (ab73996, lot GR57100-14, 1:1,000); 
28. BORIS (ab187163, Clone EP12204, lot GR228943-6); 

29. H3K27ac (ab2729, lot GR3198866-1). 


From NOVUS Biologicals (Littleton, CO, USA): 
30. BORIS (NBP2-52405, Clone 20B11, lot CRT/17/124). 


From Active Motif (Carlsbad, CA, USA): 
31. BORIS (#39851, lot 18916002, 1:3,000). 


From Sigma-Aldrich (Saint Louis, MO, USA): 
32. SIX1 (HPA001893, lot 1114662, 1:4,000). 


From Abbott (Abbott Park, IL, USA): 


33. Vysis LSI N-MYC (2p24) SpectrumGreen/Vysis CEP 2 SpectrumOrange Probe (07J72-001). 
ps://www.bethyl.com/product/A300-055A/SMC1+Antibod 


p://www.emdmillipore.com/US/en/product/Anti-Sox9-Ant 
p://www.emdmillipore.com/US/en/product/Anti-trimethy 


ps://www.abcam.com/boris-antibody-ep12204-ab187163 
ps://www.abcam.com/histone-h3-acetyl-k27-antibody-chi 


ibodies/cleaved-parp-asp214-antibody-human-specific/9541 
ibodies/cleaved-caspase-3-asp175-antibody/9661 
ibodies/alk-c26g7-rabbit-mab/3333 
ibodies/akt-pan-c67e7-rabbit-mab/4691 
ibodies/phospho-akt-thr308-antibody/9275 
ibodies/phospho-akt-ser473-antibody/9271 


42-mapk-erk1-2-137f5-rabbit-mab/4695 


bodies/phospho-p44-42-mapk-erk1-2-thr202-tyr204-197g2-rabbit- 


bodies/s6-ribosomal-protein-5g10-rabbit-mab/2217 
bodies/phospho-s6-ribosomal-protein-ser235-236-91b2-rabbit-mab/4857 
bodies/stat3-79d7-rabbit-mab/4904 
bodies/phospho-stat3-tyr705-antibody/9131 


r1-abcb1-d3h1q-rabbit-mab/12683 
2-d6d9-xp-rabbit-mab/3579 
ctin-antibody/4967 


bodies/ctcf-d1a7-xp-rabbit-mab/3417 
bodies/normal-rabbit-igg/2729 
ps://www.cellsignal.com/products/secondary-antibodies/anti-mouse-igg-hrp-linked-antibody/7076 
ps://www.scbt.com/scbt/product/mouse-anti-rabbit-igg-hrp 
ps://www.bethyl.com/product/A301-985A100/BRD4+Antibody 


y 


p://www.emdmillipore.com/US/en/product/Anti-CTCF-Antibody, MM_NF-07-729 


ibody, MM_NF-AB5535 
-Histone-H3-Lys27-Antibody, MM_NF-07-449 


ps://www.abcam.com/alk-phospho-y1507-antibody-ab73996.html 
-html 


p-grade-ab4729.html 


ps://www.novusbio.com/products/boris-antibody-20b11_nbp2-52405 
ps://www.activemotif.com/catalog/details/39851/boris-ctcfl-antibody-pab 
ps://www.sigmaaldrich.com/catalog/product/sigma/hpa001893 ?lang=en&region=US 


ps://www.molecular.abbott/sal/en-us/staticAssets/AMD-U 


S-Oncology-and-Genetics-Catalog.pdf 
Eukaryotic cell lines 


Policy information about cell lines 
Cell line source(s) Human NB cell lines Kelly and SK-N-BE(2) and human Ewing sarcoma cell lines TC-32, TC-71 and CHLA-10 were obtained from 
the Children’s Oncology Group cell line bank (Lubbock, TX, USA). Human NB cell line SK-N-SH and human embryonic kidney 
cell line HEK293T were obtained from the American Type Culture Collection (Manassas, VA, USA). 
Authentication All cell lines have been authenticated by short tandem repeat analysis. 


Mycoplasma contamination All cell lines tested negative for mycoplasma. 


Commonly misidentified lines No commonly misidentified cell lines were used. 
(See ICLAC register) 


Animals and other organisms 


Policy information about studies involving animals; ARRIVE guidelines recommended for reporting animal research 


Laboratory animals NU/NU (Crl:NU-Foxn1nu) mice (6-8 weeks old female) were purchased from Charles River Laboratories. NU/NU (CrTac:NCr- 
Foxn1nu) mice (6-8 weeks old female) were purchased from Taconic. 


Wild animals The study did not involve wild animals. 
Field-collected samples The study did not involve samples collected from the field. 
Ethics oversight All animal experiments were performed with approval from the Institutional Animal Care and Use Committee of the DFCI. 


Note that full information on the approval of the study protocol must also be provided in the manuscript. 


Human research participants 


Policy information about studies involving human research participants 


Population characteristics Data involving human research participants pertain to immunohistochemistry of formalin-fixed paraffin-embedded tumor tissue 
slides, and were de-identified prior to analysis. 


Recruitment Data involving human research participants pertain to immunohistochemistry of formalin-fixed paraffin-embedded tumor tissue 
slides, and were de-identified prior to analysis. 


Ethics oversight All human tumor specimens (formalin-fixed paraffin-embedded slides) were obtained under an Institutional Review Board- 
approved protocol of the Dana-Farber/Boston Children's Cancer and Blood Disorders Center. 


Note that full information on the approval of the study protocol must also be provided in the manuscript. 


ChIP-seq 


Data deposition 


Confirm that both raw and final processed data have been deposited in a public database such as GEO. 


Confirm that you have deposited or provided access to graph files (e.g. BED files) for the called peaks. 


Data access links https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE103084 
May remain private before publication. 


Files in database submission ChIPseq_Original_Submission_sens_res 
Raw files: 
Par_BORIS_1.fastq.gz 
Par_BORIS_2.fastq.gz 
Par_BRD4_1.fastq.gz 
Par_BRD4_2.fastq.gz 
Par_CTCF_1.fastq.gz 
Par_CTCF_2.fastq.gz 
Par_H3K27ac_1.fastq.gz 
Par_H3K27ac_2.fastq.gz 
Par_H3K27me3_1.fastq.gz 
Par_H3K27me3_2.fastq.gz 
Par_input_1.fastq.gz 
Par_input_2.fastq.gz 
Par_Pol2_1.fastq.gz 
Par_Pol2_2.fastq.gz 
Res_BORIS_1.fastq.gz 
Res_BORIS_2.fastq.gz 
Res_BRD4_1.fastq.gz 
Res_BRD4_2.fastq.gz 
Res_CTCF_1.fastq.gz 
Res_CTCF_2.fastq.gz 
Res_H3K27ac_1.fastq.gz 
Res_H3K27ac_2.fastq.gz 
Res_H3K27me3_1.fastq.gz 
Res_H3K27me3_2.fastq.gz 
Res_input_1.fastq.gz 
Res_input_2.fastq.gz 
Res_Pol2_1.fastq.gz 
Res_Pol2_2.fastq.gz 
Res_JQ1_BORIS_1.fastq.gz 
Res_JQ1_BORIS_2.fastq.gz 
Res_JQ1_BRD4_1.fastq.gz 
Res_JQ1_BRD4_2.fastq.gz 
Res_JQ1_CTCF_1.fastq.gz 
Res_JQ1_CTCF_2.fastq.gz 
Res_JQ1_H3K27ac_1.fastq.gz 
Res_JQ1_H3K27ac_2.fastq.gz 
Res_JQ1_H3K27me3_1.fastq.gz 
Res_JQ1_H3K27me3_2.fastq.gz 
Res_JQ1_input_1.fastq.gz 
Res_JQ1_input_2.fastq.gz 
Processed files: 
norm_Par_BORIS_merged.noNeg.sorted.bw 
norm_Par_BRD4_merged.noNeg.sorted.bw 
norm_Par_CTCF_merged.noNeg.sorted.bw 
norm_Par_H3K27ac_merged.noNeg.sorted.bw 
norm_Par_H3K27me3_merged.noNeg.sorted.bw 
norm_Par_Pol2_2.noNeg.sorted.bw 
norm_Res_BORIS_merged.noNeg.sorted.bw 
norm_Res_BRD4_merged.noNeg.sorted.bw 
norm_Res_CTCF_merged.noNeg.sorted.bw 
norm_Res_H3K27ac_merged.noNeg.sorted.bw 
norm_Res_H3K27me3_merged.noNeg.sorted.bw 
norm_Res_Pol2_2.noNeg.sorted.bw 
norm_Res_JQ1_BORIS_merged.noNeg.sorted.bw 
norm_Res_JQ1_BRD4_merged.noNeg.sorted.bw 
norm_Res_JQ1_CTCF_merged.noNeg.sorted.bw 
norm_Res_JQ1_H3K27ac_merged.noNeg.sorted.bw 
norm_Res_JQ1_H3K27me3_merged.noNeg.sorted.bw 
Par_BORIS_vs_Par_input_peaks.narrowPeak 
Par_BRD4_vs_Par_input_peaks.narrowPeak 
Par_CTCF_vs_Par_input_peaks.narrowPeak 
Par_narrow_Pol2_vs_Par_input_peaks.narrowPeak 
Res_BORIS_vs_Res_input_peaks.narrowPeak 
Res_BRD4_vs_Par_input_peaks.narrowPeak 
Res_CTCF_vs_Res_input_peaks.narrowPeak 
Res_narrow_Pol2_vs_Res_input_peaks.narrowPeak 
Res_JQ1_BORIS_vs_Res_JQ1_input_peaks.narrowPeak 
Res_JQ1_BRD4_vs_Res_JQ1_input_peaks.narrowPeak 
Res_JQ1_CTCF_vs_Res_JQ1_input_peaks.narrowPeak 
Par_H3K27ac_vs_Par_input_peaks.broadPeak 
Par_H3K27me3_vs_Par_input_peaks.broadPeak 
Res_H3K27ac_vs_Res_input_peaks.broadPeak 
Res_H3K27me3_vs_Res_input_peaks.broadPeak 
Res_JQ1_H3K27ac_vs_Res_JQ1_input_peaks.broadPeak 
Res_JQ1_H3K27me3_vs_Res_JQ1_input_peaks.broadPeak 
ChIPseq_MYCN 

Raw files: 
Kelly_input_MYCN_1.fastq.gz 
Kelly_MYCN_CS_1.fastq.gz 
Kelly_MYCN_CS_2.fastq.gz 
Res_Kelly_input_MYCN_CS_1.fastq.gz 
Res_Kelly_input_MYCN_CS_2.fastq.gz 
Res_Kelly_MYCN_CS_1.fastq.gz 
Res_Kelly_MYCN_CS_2.fastq.gz 


Processed files: 


Par_Kelly_MYCN.noNeg.sorted.bw 
Res_Kelly_MYCN.noNeg.sorted.bw 
Par_Kelly_MYCN_vs_Par_Kelly_MYCN_INPUT_peaks.narrowPeak 
Res_Kelly_MYCN_vs_Res_Kelly_MYCN_INPUT_peaks.narrowPeak 


ChIPseq_KD_Boris 

Raw files: 
hB4_Res_Kelly_BORIS_1.fastq.gz 
hB4_Res_Kelly_BORIS_2.fastq.gz 
hB4_Res_Kelly_CTCF_1.fastq.gz 
hB4_Res_Kelly_CTCF_2.fastq.gz 
hB4_Res_Kelly_H3K27ac_1.fastq.gz 
hB4_Res_Kelly_H3K27ac_2.fastq.gz 
hB4_Res_Kelly_INPUT_1.fastq.gz 
hB4_Res_Kelly_INPUT_2.fastq.gz 
hB4_Res_Kelly_INPUT_BORIS_1.fastq.gz 
hB4_Res_Kelly_INPUT_BORIS_2.fastq.gz 
hB4_Res_Kelly_ SMC1_1.fastq.gz 
hB4_Res_Kelly_SMC1_2.fas 


q.gz 


hLUC_Res_Kelly_BORIS_1.fastq.gz 
hLUC_Res_Kelly_BORIS_2.fastq.gz 
hLUC_Res_Kelly_CTCF_1.fastq.gz 
hLUC_Res_Kelly_CTCF_2.fastq.gz 
hLUC_Res_Kelly_H3K27ac_1.fastq.gz 
hLUC_Res_Kelly_H3K27ac_2.fastq.gz 
hLUC_Res_Kelly_INPUT_1.fastq.gz 
hLUC_Res_Kelly_INPUT_2.fastq.gz 
hLUC_Res_Kelly_SMC1_1.fastq.gz 
hLUC_Res_Kelly_SMC1_2.fastq.gz 
rocessed files: 


hB4_Res_Kelly_BORIS.noNeg.sorted.bw 
hB4_Res_Kelly_CTCF.noNeg.sorted.bw 
hB4_Res_Kelly_H3K27ac.noNeg.sorted.bw 
hB4_Res_Kelly_SMC1.noNeg.sorted.bw 
hLUC_Res_Kelly_BORIS.noNeg.sorted.bw 
hLUC_Res_Kelly_CTCF.noNeg.sorted.bw 
hLUC_Res_Kelly_H3K27ac.noNeg.sorted.bw 
hLUC_Res_Kelly_SMC1.noNeg.sorted.bw 
hB4_Res_Kelly_BORIS_vs_shB4_Res_Kelly_INPUT_peaks.narrowPeak 
hB4_Res_Kelly_CTCF_vs_shB4_Res_Kelly_INPUT_peaks.narrowPeak 
hB4_Res_Kelly_H3K27ac_vs_shB4_Res_Kelly_INPUT_peaks.broadPeak 
hB4_Res_Kelly_SMC1_vs_shB4_Res_Kelly_INPUT_peaks.narrowPeak 
hLUC_Res_Kelly_BORIS_vs_shLUC_Res_Kelly_INPUT_peaks.narrowPeak 
hLUC_Res_Kelly_CTCF_vs_shLUC_Res_Kelly_INPUT_peaks.narrowPeak 
hLUC_Res_Kelly_H3K27ac_vs_shLUC_Res_Kelly_INPUT_peaks.broadPeak 
hLUC_Res_Kelly_ SMC1_vs_shLUC_Res_Kelly_INPUT_peaks.narrowPeak 
ChiPseq_Ewing Sarcoma 

Raw files: 

TC32_BORIS_1.fastq.gz 

TC32_BORIS_2.fastq.gz 

TC32_H3K27ac_1.fastq.gz 

TC32_H3K27ac_2.fastq.gz 

TC32_input_1.fastq.gz 

TC32_input_2.fastq.gz 

TC32_SMC1_1.fastq.gz 

TC32_SMC1_2.fastq.gz 

TC71_BORIS_1.fastq.gz 

TC71_BORIS_2.fastq.gz 

TC71_H3K27ac_1.fastq.gz 

TC71_H3K27ac_2.fastq.gz 

TC71_input_1.fastq.gz 

TC71_input_2.fastq.gz 

TC71_SMC1_1.fastq.gz 

TC71_SMC1_2.fastq.gz 

Processed files: 
Ewings_TC32_BORIS.noNeg.sorted.bw 
Ewings_TC32_H3K27ac.noNeg.sorted.bw 
Ewings_TC32_SMC1.noNeg.sorted.bw 
Ewings_TC71_BORIS.noNeg.sorted.bw 
Ewings_TC71_H3K27ac.noNeg.sorted.bw 
Ewings_TC71_SMC1.noNeg.sorted.bw 
Ewings_TC32_BORIS_vs_David_Ewings_TC32_INPUT_peaks.narrowPeak 
Ewings_TC32_H3K27ac_vs_David_Ewings_TC32_INPUT_peaks.broadPeak 
Ewings_TC32_SMC1_vs_David_Ewings_TC32_INPUT_peaks.narrowPeak 
Ewings_TC71_BORIS_vs_David_Ewings_TC71_INPUT_peaks.narrowPeak 
Ewings_TC71_H3K27ac_vs_David_Ewings_TC71_INPUT_peaks.broadPeak 
Ewings_TC71_SMC1_vs_David_Ewings_TC71_INPUT_peaks.narrowPeak 
Microarray_Original_Submission_sens_res_res-JQ1 

Raw files: 

DD_1_PrimeView_par_Kelly_DMSO.CEL 
DD_2_PrimeView_par_Kelly_DMSO.CEL 
DD_5_PrimeView_Res_Kelly_TAE.CEL 
DD_6_PrimeView_Res_Kelly_TAE.CEL 
DD_7_PrimeView_Res_Kelly_JQ1.CEL 
DD_8_PrimeView_Res_Kelly_JQ1.CEL 

Microarray_KD_Boris 
Raw files: 
Res_Kelly_shBORIS_1.CEL 
Res_Kelly_shBORIS_2.CEL 
Res_Kelly_shLUC_1.CEL 
Res_Kelly_shLUC_2.CEL 
10X_sens_IR_Res 
Raw files: 
Sens.possorted_genome_bam.bam 
Intermediate_Res.possorted_genome_bam.bam 
Full_Res.possorted_genome_bam.bam 
Processed files: 
Sens_matrix.mtx 
Intermediate_Res_matrix.mtx 
Full_Res_matrix.mtx 
barcodes.tsv 
genes.tsv 
HiChIP_sens_res 
Raw files: 
KellyRes1_1.fastq.tar.gz 
KellyRes1_2.fastq.tar.gz 
KellyRes2_1.fastq.tar.gz 
KellyRes2_2.fastq.tar.gz 

KellyRes3_1.fastq.tar.gz 
KellyRes3_2.fastq.tar.gz 
KellyWTrep1_1.fastq.bz2 
KellyWTrep1_2.fastq.bz2 
KellyWTrep2_1.fastq.bz2 
KellyWTrep2_2.fastq.bz2 
KellyWTrep3_1.fastq.bz2 
KellyWTrep3_2.fastq.bz2 
Processed files: 
Kelly-Res-SMC1.bedpe 
Kelly-Res-SMC1-petcount.bedpe 
Kelly-SMC1.bedpe 
Kelly-SMC1-petcount.bedpe 
HiChIP_KD_Boris 

Raw files: 
Kelly_shBORIS1_1.fastq.tar.gz 
Kelly_shBORIS1_2.fastq.tar.gz 
Kelly_shBORIS2_1.fastq.tar.gz 
Kelly_shBORIS2_2.fastq.tar.gz 
Kelly_shBORIS3_1.fastq.tar.gz 
Kelly_shBORIS3_2.fastq.tar.gz 
Kelly_shLUC1_1.fastq.tar.gz 
Kelly_shLUC1_2.fastq.tar.gz 
Kelly_shLUC2_1.fastq.tar.gz 
Kelly_shLUC2_2.fastq.tar.gz 
Kelly_shLUC3_1.fastq.tar.gz 
Kelly_shLUC3_2.fastq.tar.gz 
Processed files: 
shBORIS-results.csv 
shLUC-results.csv 




Genome browser session No longer applicable. 
(e.g. UCSC) 


Methodology 


Replicates For each mark assessed, two biological replicates were performed. All ChIP-seq data are derived from the analysis of both 
replicates. 
Sequencing depth ChIP-seq libraries were generated using the NEBNext Ultra II DNA Library Prep kit (E7645), following the manufacturer's 
instructions. Starting DNA material ranged from 2.5 to 10 ng, and PCR amplification (8 to 10 cycles) was performed using 
NEBNext Multiplex Oligos for Illumina (E7335 and E7500) with distinct indices to allow for multiplexing of up to 12 samples 
to be run on the Illumina NextSeq 500 for 75 bases in single-read mode. 

Sample Total_reads Uniquely_mapped_reads 

Par_BORIS_1.fastq.gz 52592735 39434783 
Par_BORIS_2.fastq.gz 54031826 40040869 
Par_BRD4_1.fastq.gz 32913499 24145172 
Par_BRD4_2.fastq.gz 34899996 25050068 
Par_CTCF_1.fastq.gz 32357383 21742074 
Par_CTCF_2.fastq.gz 38048050 25570483 
Par_H3K27ac_1.fastq.gz 34652785 31481286 
Par_H3K27ac_2.fastq.gz 36652655 33723955 
Par_H3K27me3_1.fastq.gz 35375006 28384362 
Par_H3K27me3_2.fastq.gz 33791944 27655558 
Par_Pol2_1.fastq.gz 24373861 6139176 
Par_Pol2_2.fastq.gz 41084548 2685056 
Kelly_MYCN_CS_1.fastq.gz 22544426 19447997 
Kelly_MYCN_CS_2.fastq.gz 43975115 38012488 
Par_input_1.fastq.gz 74638885 66630061 
Par_input_2.fastq.gz 40772400 36361135 
Kelly_input_MYCN_1.fastq.gz 43184072 34957425 
Res_BORIS_1.fastq.gz 43137736 32555678 
Res_BORIS_2.fastq.gz 47143635 35144692 
Res_BRD4_1.fastq.gz 32159271 24332339 
Res_BRD4_2.fastq.gz 36755542 30051197 
Res_CTCF_1.fastq.gz 40911369 28372298 
Res_CTCF_2.fastq.gz 36878692 25628082 
Res_H3K27ac_1.fastq.gz 33542051 30727706 
Res_H3K27ac_2.fastq.gz 37596270 34409319 
Res_H3K27me3_1.fastq.gz 35238601 29098617 
Res_H3K27me3_2.fastq.gz 35026938 28826263 
Res_Pol2_1.fastq.gz 35817388 6953374 
Res_Pol2_2.fastq.gz 48429363 2674818 
Res_Kelly_MYCN_CS_1.fastq.gz 18821385 14198653 
Res_Kelly_MYCN_CS_2.fastq.gz 30543884 24497611 
Res_Kelly_input_MYCN_CS_1.fastq.gz 77322913 68513199 
Res_Kelly_input_MYCN_CS_2.fastq.gz 53883106 47951287 
Res_input_1.fastq.gz 45072526 40107804 
Res_input_2.fastq.gz 49535965 44608902 
Res_JQ1_BORIS_1.fastq.gz 81479030 61882578 
Res_JQ1_BORIS_2.fastq.gz 64057634 49039377 
Res_JQ1_BRD4_1.fastq.gz 32714245 25430455 
Res_JQ1_BRD4_2.fastq.gz 33031805 25335103 
Res_JQ1_CTCF_1.fastq.gz 33287924 24508642 
Res_JQ1_CTCF_2.fastq.gz 39674176 29154183 
Res_JQ1_H3K27ac_1.fastq.gz 40677082 37149902 
Res_JQ1_H3K27ac_2.fastq.gz 32078957 29293906 
Res_JQ1_H3K27me3_1.fastq.gz 36342163 30007069 
Res_JQ1_H3K27me3_2.fastq.gz 37930668 31338450 
Res_JQ1_input_1.fastq.gz 49313027 43993392 
Res_JQ1_input_2.fastq.gz 43284562 38433644 
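
The read counts above are easier to compare as the fraction of reads that mapped uniquely. The sketch below computes that fraction for four rows copied from the table. The helper names and the 50% flagging threshold are illustrative assumptions, not quality criteria stated in this summary.

```python
# (sample, total reads, uniquely mapped reads), copied from the table above
ROWS = [
    ("Par_BORIS_1.fastq.gz", 52592735, 39434783),
    ("Par_H3K27ac_1.fastq.gz", 34652785, 31481286),
    ("Par_Pol2_1.fastq.gz", 24373861, 6139176),
    ("Par_Pol2_2.fastq.gz", 41084548, 2685056),
]


def unique_fraction(total_reads, unique_reads):
    """Fraction of sequenced reads that mapped to a single genomic location."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return unique_reads / total_reads


def low_mapping_samples(rows, threshold=0.5):
    """Names of samples whose unique-mapping fraction is below `threshold` (hypothetical cut-off)."""
    return [name for name, total, unique in rows
            if unique_fraction(total, unique) < threshold]


for name, total, unique in ROWS:
    print(f"{name}\t{unique_fraction(total, unique):.1%}")
```

With these rows, only the two Pol2 libraries fall below the illustrative threshold.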
Antibodies The following antibodies were used: MYCN (#51705, Cell Signaling Technology), BRD4 (A301-985A100, lot 6, Bethyl 
Laboratories), CTCF (#07-729, lot 2887267, Millipore), H3K27me3 (#07-449, lot 2972864, Millipore), H3K27ac (ab2729, lot 
GR3198866-1, Abcam), and BORIS (#39851, lot 18916002, Active Motif). 

All antibodies were previously validated by their manufacturers. 


Peak calling parameters Samples were aligned to the human genome (build hg19, GRCh37.75) with STAR (v2.5.1b_modified) and the parameters 
“--alignIntronMax 1 --alignEndsType EndToEnd --outFilterMultimapNmax 1 --outFilterMismatchMax 5”. Next, non-duplicate 
reads that mapped to the reference chromosomes were retained using Samtools (v1.3.1) and MarkDuplicates (v2.1.1) from 
Picard tools. Peaks were identified with MACS2 (2.1.1) for narrow peaks with the parameters “--q 0.01 --call-summits” and 
for broad peaks with the parameters “--broad-cutoff 0.01”. 


Data quality 1) Peaks overlapping regions with known artefact regions (http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/) 
were blacklisted out. 
2) Antibody enrichment was assessed using the plotFingerprint command from deepTools (v2.2.4). 
3) Correlation of replicates was assessed with the deepTools command “multiBigwigSummary BED-file” using all bigwigs and 
identified peaks as input. 
4) broad peaks with 5-fold enrichment, 5% FDR and after blacklisting 
Par_H3K27ac_vs_Par_input_peaks.broadPeak.blacklisted 15626 
Par_H3K27me3_vs_Par_input_peaks.broadPeak.blacklisted 3946 
Res_H3K27ac_vs_Res_input_peaks.broadPeak.blacklisted 16725 
Res_H3K27me3_vs_Res_input_peaks.broadPeak.blacklisted 2293 
Res_JQ1_H3K27ac_vs_Res_JQ1_input_peaks.broadPeak.blacklisted 15942 
Res_JQ1_H3K27me3_vs_Res_JQ1_input_peaks.broadPeak.blacklisted 1819 
5) narrow peaks with 5-fold enrichment, 5% FDR and after blacklisting 
Par_BORIS_vs_Par_input_peaks_blacklisted.narrowPeak 1055 
Par_BRD4_vs_Par_input_peaks_blacklisted.narrowPeak 52158 
Par_CTCF_vs_Par_input_peaks_blacklisted.narrowPeak 69566 
Par_narrow_Pol2_vs_Par_input_peaks_blacklisted.narrowPeak 19193 
Par_MYCN_vs_Par_input_peaks_blacklisted.narrowPeak 36047 
Res_BORIS_vs_Res_input_peaks_blacklisted.narrowPeak 17035 
Res_BRD4_vs_Res_input_peaks_blacklisted.narrowPeak 26297 
Res_CTCF_vs_Res_input_peaks_blacklisted.narrowPeak 58724 
Res_JQ1_BORIS_vs_Res_JQ1_input_peaks_blacklisted.narrowPeak 12674 
Res_JQ1_BRD4_vs_Res_JQ1_input_peaks_blacklisted.narrowPeak 9702 
Res_JQ1_CTCF_vs_Res_JQ1_input_peaks_blacklisted.narrowPeak 37791 
Res_narrow_Pol2_vs_Res_input_peaks_blacklisted.narrowPeak 54174 
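
The peak counts in items 4 and 5 reflect three filters: at least 5-fold enrichment, 5% FDR and removal of blacklisted regions. The sketch below shows one way such a filter could be written. It is an assumption-laden illustration rather than the authors' pipeline: it assumes MACS2-style peak rows in which column 7 holds fold enrichment and column 9 holds −log10(q value), reads "5% FDR" as q ≤ 0.05, and uses invented toy peaks and blacklist intervals.

```python
import math

MIN_FOLD = 5.0   # minimum fold enrichment (from the list above)
MAX_FDR = 0.05   # 5% FDR, interpreted here as q <= 0.05 (assumption)


def overlaps(a_start, a_end, b_start, b_end):
    """Half-open intervals [start, end) overlap if each starts before the other ends."""
    return a_start < b_end and b_start < a_end


def keep_peak(fields, blacklist):
    """Return True if a tab-split peak row passes all three filters."""
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    fold_enrichment = float(fields[6])   # column 7
    neg_log10_q = float(fields[8])       # column 9
    if fold_enrichment < MIN_FOLD:
        return False
    if neg_log10_q < -math.log10(MAX_FDR):
        return False
    return not any(overlaps(start, end, b_start, b_end)
                   for b_start, b_end in blacklist.get(chrom, []))


def filter_peaks(lines, blacklist):
    """Keep only the peak rows that pass `keep_peak`."""
    return [line for line in lines
            if keep_peak(line.rstrip("\n").split("\t"), blacklist)]


# Toy data, invented for illustration only
BLACKLIST = {"chr1": [(1000, 2000)]}
PEAKS = [
    "chr1\t500\t900\tpeak_a\t100\t.\t8.2\t12.0\t9.5\t200",     # passes all filters
    "chr1\t1500\t1800\tpeak_b\t100\t.\t9.0\t15.0\t11.0\t150",  # overlaps the blacklist
    "chr2\t100\t400\tpeak_c\t50\t.\t3.1\t6.0\t4.0\t120",       # fold enrichment below 5
    "chr2\t900\t1200\tpeak_d\t20\t.\t6.0\t1.5\t1.0\t80",       # q value above 0.05
]
KEPT = [line.split("\t")[3] for line in filter_peaks(PEAKS, BLACKLIST)]
```

The linear scan over blacklist intervals keeps the example short; a sorted or indexed structure would be the usual choice for genome-scale peak sets.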
1), Rstudio (v1.1.463), data.table (v1.12.2), trimmomatic (v0.36), HiC-Pro (v2.10.0), 
hichipper (v0.7.3), diffloop (v1.10.0), ROSE (v1), Circlize (v0.4.5), IGV (v2.3.74). Custom code is available upon reasonable 
Flow Cytometry 

Methodology 

Sample preparation Flow cytometry was used for cell cyc 
fixation of 1 × 10^6 cells overnight at 4 °C with 80% ethanol, cells were resuspended in PBS supplemented with 0.1% Triton X-100, 
, and incubated for 45 min at 37 °C in the dark before analysis. 

Instrument 

Software Data collection was done with BD CellQuest Pro software (BD Biosciences), and analysis was performed using FlowJo software 
(v10.0.5). 

Cell population abundance No FACS sorting was performed for t 

Gating strategy Cell debris as well as non-singlets we 
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CORRECTIONS & AMENDMENTS 


CORRECTION 
https://doi.org/10.1038/s41586-019-1378-x 


Author Correction: Diverse and robust molecular algorithms 
using reprogrammable DNA self-assembly 


Damien Woods, David Doty, Cameron Myhrvold, Joy Hui, 
Felix Zhou, Peng Yin & Erik Winfree 


Correction to: Nature https://doi.org/10.1038/s41586-019-1014-9, 
published online 20 March 2019. 


In Fig. 1 of this Letter, prime symbols were erroneously included in 
some labels in panels c and d. In the bottom section of panel c, in the 
diagram beneath ‘SST self-assembly’, the labels w2a′, w3a′, w4a′ and 
w5a′ should read w2a, w3a, w4a and w5a, respectively. Similarly, in 
panel d, the labels w2a′ and w3a′ should read w2a and w3a, respectively. 
Additionally, there were some omissions in the Acknowledgements: 
R. Schulman should have been thanked for experimental advice, and 
R. Hariadi for contributing to preliminary designs for algorithmic 
self-assembly by SST. Finally, in Supplementary Figs. 8 and 9, the 
rightmost labels s should read s′, and on page 64 of the Supplementary 
Information a citation to Telser et al. (1989) was missing and has been 
added as ref. 89; the subsequent citations have been renumbered. 
The Supplementary Information has been updated accordingly, and 
minor changes have also been made to the phrasing throughout to 
improve clarity. The original, incorrect version of the Supplementary 
Information is included as Supplementary Information to this 
Amendment, for transparency. The original Letter has been corrected 
online. 


Supplementary Information is available in the online version of this Amendment. 


CORRECTION 
https://doi.org/10.1038/s41586-019-1454-2 


Publisher Correction: 
Heterochromatin drives 
compartmentalization of inverted 
and conventional nuclei 


Martin Falk, Yana Feodorova, Natalia Naumova, 
Maxim Imakaev, Bryan R. Lajoie, Heinrich Leonhardt, 
Boris Joffe, Job Dekker, Geoffrey Fudenberg, Irina Solovei & 
Leonid A. Mirny 


Correction to: Nature https://doi.org/10.1038/s41586-019-1275-3, 
published online 05 June 2019. 


In this Letter, the x-axis labels of Fig. 3b were inadvertently shifted to 
the left. The original Letter has been corrected online, and Fig. 1 of this 
Amendment shows the original panel, for transparency. 

In addition, in panels b1 and c1 of Extended Data Fig. 9, the word 
‘week’ should have been ‘weak’. The original Letter has been corrected 
online. 


Original Fig. 3b Corrected Fig. 3b 
Fig. 1 | This figure displays the corrected and the incorrect published Fig. 3b of the original Letter. 


Greta Thunberg, an environmental activist from Sweden, speaks about the climate to politicians, 
the media and guests at the UK Houses of Parliament. 


COLUMN 


How to bring science into politics 


Six ways to gain traction with policymakers, from Hannah Safford and Austin Brown. 


No matter how hard scientists work, our 
impact will almost always be limited 
to our immediate academic circles if 
our results never catch the attention of those 
who have the power to act on them. These 
people are often policymakers: local, state 
or central-government officials who write 
laws and regulations, craft budgets and govern 
communities. 

But effective collaboration requires strong 
communication. The policy world can be 
tricky to navigate. Institutions can seem 
impenetrable, and decision-making is often 
opaque. Fortunately, simple strategies can 
help scientists to communicate effectively 
with policymakers. 


STRATEGIES FOR SUCCESS 

Know who you want to reach. Communicating 
with top-ranking officials — such as a state 
governor or a US senator — isn't always the 
most effective way to spur change. Perhaps your 
research shows that overflowing storm drains 
are harming a nearby ecosystem. Such a local 
issue isn't likely to rise high on the priority list 
of a senator, but does fall within the remit of 
your state environmental protection agency or 
county board of supervisors. 


Your best partners might even be outside the 
government. Non-profit organizations, industry 
groups, advocacy organizations and 
private-sector companies don't implement public policy 
as such, but certainly shape the debate. If you 
aren't sure who you need to reach, ask around! 
Your university's government-relations office 
and colleagues in your field might be able to 
point you in the right direction. 


Have clear and actionable recommenda- 
tions. Providing specific recommendations 
makes it easier for your audience to act. 
Pointing out that more charging stations 
would support a growing demand for 
electric vehicles is a good start, but it’s even 
more helpful to explain exactly where and how 
many are needed. Specificity also prepares you 
to defend your recommendations by forcing 
you to think through the details. 

Your suggestions should be feasible. Every 
government body is constrained by its mission 
and budget. Do your best to propose actions 
that fall within your target agency’s authority. 

Finally, remember that making recommen- 
dations is not the same as advocacy. One of the 
most valuable roles a scientist can have is laying 
out the likely pros and cons of different policy 
options. Whereas an advocate typically exhorts 
a policymaker to ‘do Y’, a scientist can marshal 
the best available research to explain that, “If 
you do Y, chances are good that Z will result.” 


Repackage your work. The peer-reviewed 
article is the currency of the scientific realm, 
but it’s not going to get you far in policy. A new 
audience demands a new format — one that is 
accessible and understandable. 

Consider synthesizing your key findings 
and recommendations into a two-page policy 
brief that can be distributed easily in person or 
online. Repackaging your work into a publish- 
able blog or opinion piece is also useful when 
you're trying to reach a broader audience. 


Write well. Conversations and presentations 
are great ways to introduce a topic, but policy- 
makers will want a written product to react to 
or to share with colleagues. 


Former US president Barack Obama met with traditional fishers near Dillingham, Alaska, as part of a trip in 2015 to call attention to climate change. 


Organization, brevity and clarity are more 
important than wit or style when it comes to 
policy writing. State your key points first, then 
provide more explanation. Make sure there is a 
clear one-sentence takeaway in the first para- 
graph. Add headings to separate sections, and 
use visual cues, such as bullets, to draw atten- 
tion to key points. Define technical terms and 
spell out acronyms. 

Above all, get someone else to read your 
work. Communicating your science to friends 
(especially non-experts) is the best way to 
get better at communicating your science to 
policymakers. 


Pick your moment. Strategically selecting 
when to engage increases the chance that your 
idea will fall on receptive ears. Electoral and 
legislative calendars can help you to choose a 
good time. Meetings with elected 
officials tend to be much more 
effective towards the beginning of a 
term (when policy priorities are being set) 
than towards the end. 

When in doubt, engage early. By the time a 
bill comes up for a vote, or a rule is in its final 
stages, most policymakers will have been dis- 
cussing it for months or longer. Even highly 
credible input will be unlikely to change 
minds. Look for newsletters and podcasts that 
can help you stay aware of when topics you 
care about are coming up for debate, and 
talk to legislators before this happens. Submit 
comments on draft rules and participate in 
stakeholder workshops when those are offered. 

Current events can yield extra opportunities 
to advance your work. The value of news- 
generated ‘policy windows’ has been well 
documented. Stay aware of what's going on in 
the world and link your research to it. 


Sustain and amplify your engagement. 
Building support takes time and ongoing 
effort. Partnering with people and institutions 
who have an agenda similar to yours is a great 
way to strengthen your collective case. 

It’s also crucial to follow up. Policy proposals 
evolve as they undergo review, debate and 
public comment. Once you've established 
a relationship with a key player, check in 
periodically to stay abreast of changes and 
update your recommendations accordingly. 
As a bonus, staying in touch demonstrates a 
level of investment that sets you apart from the 
crowd. This, in turn, increases the odds that 
a policymaker will reach out proactively with 
questions, making it even easier for you to stay 
in the loop as policies move forward. 


Hannah Safford is a PhD student in 
environmental engineering at the University 
of California, Davis, and a researcher with 
the UC Davis Policy Institute for Energy, 
Environment, and the Economy. Austin 
Brown is executive director of the UC Davis 
Policy Institute for Energy, Environment, and 
the Economy. 


JONATHAN ERNST/REUTERS 


IAN GAVAN/GETTY 


In February 2017, Helen Currie joined a team 
of researchers on a tour of UK festivals to share 
the impact of her work with people outside 
her field. Currie studies how sound affects 
migratory-fish behaviour, and is in the fourth 
year of a doctoral programme at the University 
of Southampton, UK. 


How did you break into science communication? 
Some of the events I took part in during the 
first year of my PhD lacked the impact I was 
hoping for. When we just set up under a gazebo 
in a park with a poster, nobody stopped to talk 
to us, which was discouraging. At an ecology 
festival in Southampton, however, my team 
and I found our groove. We received feedback 
from environment-conscious visitors who 
thought that our research into how dams, weirs 
and other structures can damage migratory- 
fish ecosystems would have a real impact on 
assisting the development of river infrastruc- 
ture that is more sustainable. 


How do you engage people with your research? 
After our first few events, we learnt that 
our activities needed to be hands-on and 
interactive, so with the help of others at the Uni- 
versity of Southampton, we created a marble 
run consisting of a custom-built sloped struc- 
ture on which people place several marbles. 
These roll down past a series of barriers, such 
as gates, gaps, side channels and plastic pins. 
The marbles represent fish, and the run itself 
is a river, with barriers representing things that 
are dangerous to the fish, such as hydropower 
turbines or locations where water is purposely 
removed from the river to control flooding or 
for irrigation use. Participants can alter the 
river system to make it more friendly or hostile 
for the fish. We challenge participants to get as 
many fish through the marble run as possible 
while we explain our research. 

We then joined our university’s 
public-engagement ‘roadshow’. This is an ongoing 
project that encourages researchers to share 
their work with the public. Every summer, 
the roadshow travels to several events in the 
United Kingdom, including music festivals. 


Which festivals have you attended? 

The biggest one was Glastonbury 2017, a 5-day 
event that drew 135,000 attendees. For the past 
few years, festival organizers have invited UK 
universities to showcase some of their research 
in the science tent. Our team shared the tent 
with a number of other universities. I also 
presented my work at the Green Man, an arts 
festival in Crickhowell, Wales, which welcomes 
researchers and universities to apply for a space 
in a dedicated science area. 


Festivals are a place to celebrate not only music and art, but also science. 


What is it like to be a science presenter at 
these events? 

It can be logistically challenging because you're 
in a field, not a building. You really need to 
think about what activity you're bringing with 
you. How heavy is it? Do you need a power 
source? But it’s also great fun. All the present- 
ers at Glastonbury had prepared a key message 
about their project to share with visitors, which 
was simple enough for any other presenters to 
take over and explain the concept. So if we 
wanted to see a band play, we could briefly hand 
over responsibility to another presenter and 
experience some of the festival for ourselves. 
There were a number of great bands there that 
year, but my favourite was Royal Blood. 


Are people surprised to find science at a music 
festival? 

People these days tend to expect more than just 
music. Glastonbury has comedy, poetry read- 
ings and art installations, so a science tent fits 
in well. People come to get an experience, and 
learning just happens to be a secondary 
outcome. And science probably isn’t the strangest 
thing you will find at a music festival. A couple 
of people did ask us why we were there, so we 
explained what we did and the importance of 
giving back. A lot of public funding goes into 
science research. It's good to get out and tell peo- 
ple what their taxes are being spent on and why. 


Who visits the science area? 

It’s so varied. Some people hear there's a science 
tent and actively seek it out. Others wander 
around the festival and just come across it. 
There is a wide range of ages as well, from 


families with young kids to groups of friends, 
older couples and lone travellers. 


How do you switch your message between 
these different visitors? 

People will tell you when they know more about 
your topic. Even when talking to kids, you'd be 
surprised what they're already aware of. But you 
don't want to pitch too high and risk confus- 
ing or boring someone, so it’s best to start lower 
and increase the level as the conversation goes. 
You can have really long conversations this way. 
Sometimes I walked away having learnt some- 
thing myself. For example, a fisherman I was 
speaking to referred to a three-spined 
stickleback fish as a ‘bramstickle’, a term I had never 
heard before. I have since found out that in 
the British Isles, there are at least 70 dialectical 
terms for the same species of fish. 


What have you learnt about science 
communication from your festival experience? 
It really helped me to see the big picture of my 
research. I think we sometimes get caught up 
in the ‘nicheness’ of what we do, but talking to 
people at music festivals really helped me to 
break it back down to the key research question. 


What is your advice to colleagues who want to 
share their research at a similar event? 

Just give it a go. You get so much out of it in 
terms of feedback from people. It makes you 
realize that you're having an impact, and the 
conversations you have with people make you 
think about your research in a different way. 
So if you have the opportunity, get involved. 


INTERVIEW BY EVA AMSEN 


This interview has been edited for length and clarity. 


SCIENCE FICTION 


SEEDS TRAVEL 


BY BETH GODER 


Hajar piloted the mech over vast 
mountains, through meadows lush 
with grasses that were almost like 
those of Earth, except for their orange tips. 
She travelled through dense forests and 
snowscapes heavy with wind. 

The mech was like an extension of her 
body, never tiring, wrapping around her like 
a seed pod protecting its cargo. 

Hajar reviewed the data from the soil 
samples gathered by the mech’s feet. A rich 
composition of minerals in this region, but 
not enough nitrogen for farming. 

In her pocket rested a blue stone. 


When Hajar was six, back on Earth, she 
brought her father a seed. “What's this?” 
she asked. 

He scooped her onto the kitchen 
counter and kissed her nose. “That’s a 
boxelder maple seed. Look at how it’s 
held within this thin layer, like a paper 
coat.” 

“How did it get here?” 

Her father told her that seeds travel by 
wind and water, in the hard shells of nuts 
and blankets of fruit, carried on the coats 
of animals or mashed within their diges- 
tive tracts, pulled underground by insects, 
buried by squirrels, scattered by the dual 
forces of pressure and gravity. 

Blackberry plants stretch their creeping 
vines, plunging spines into the earth. Coco- 
nuts embark on sea journeys, carrying the 
weight of meat and milk, to germinate on 
the sands of far beaches. Dandelion seeds 
dance in the air, and the boxelder maple 
encases its seeds in thin wings, to glide 
gently down. 


Hajar triggered the wings on the mech and 
leapt from a cliff, gliding in large loops until 
she touched down. The hands of the mech 
tested the air for breathability and tempera- 
ture, and searched for spores. 

Forward went the mech. Hajar read 
messages from the others. Nathaniel was 
heading north from a desert in the west- 
ern hemisphere, while Denisa had found 
a promising sector near the equator. Arwa 
had done a fascinating preliminary survey 
of insects in a grassland biome. No one had 
heard from Suraya, aside from a terse mes- 
sage that she had touched down and was 
exploring, but she tended to go quiet on 
bad days. 

Plotting a route. 


They all carried their own secret griefs, 
ready to bloom. Nathaniel refused to listen 
to Debussy, except for days when he would 
listen to nothing else, and the melody of 
‘Clair De Lune’ floated across the ship. 
Arwa had a singular teacup, light green 
and decorated with fish, which she hid 
away in her quarters. Every member of the 
crew had left someone behind. They car- 
ried their grief in different ways: in photo- 
graphs and letters, in knitted scarves and 
handwritten recipes, in ordinary objects 
such as teacups and brooms, chipped pots 
and blue stones. 

The mech marched onward under an alien 
sunset. When it grew too dark to see, Hajar 
halted the machine. She took the blue stone 
out of her pocket and turned it over. Before 
she left Earth, her father had folded the stone 
into her hands, his faded green hat tipped 
back, his hands smelling of cinnamon from 
baking. The stone had come from his garden, 
prised up from the mud. “Find a new place 
for it,” he’d said, planting those words in the 
space meant for goodbyes. This was what she 
brought from Earth, what she carried. 

Today, the stone felt rough against her fin- 
gers. She thought of her father, imagined the 
sound of the blender churning his breakfast, 
the worn leather of his hiking boots, his col- 
lection of rocks scattered over geology books 
on the kitchen counter. 

Eventually, she slept, the stone clutched in 
her hand. 


For her thesis, Hajar studied seed-dispersal 
strategies of Oenothera deltoides, the bird-cage 
plant of Californian deserts. 

The plant travels like this: as dunes shift, 
the roots are exposed. Shade melts away, 
leaving the plant under the light of an 
intolerable sun. The plant dies, curling its roots 
over itself: a bird cage, a wicker ball. Wind 
pulls the plant kilometres from its home. 
When the plant finds shelter from wind, 
seeds spill out from the lattice. 

New plants rise, phoenix-like, from the 
husk. 


The mech emerged in a clearing blanketed 
with grass. Three-petalled flowers bent in 
the wind. 

The soil readings were good. A river 
rushed by to the east, the water potable. 

Hajar sent a message to the others. 

A habitable zone to add to the list. She 
imagined all of them out in the meadow, 
tilling the soil, constructing houses from 
the durable bio-plastic they’d brought. 

Hajar emerged from the mech. For the 
first time, she felt the planet’s air on her 
face. The wind carried the smell of 
sun-simmered grass and wet soil. 

Next to her, a tree thick with seeds 
wrapped in flexible coating, like a boxelder 
maple. 

Once, her father had told her how seeds 
travel — to arid deserts and rich soil, 
through woodland and tundra, across 
oceans and rivers. By centimetres or kilo- 
metres, they go. 

Not all of them survive. 

Seeds travel, tumbling, falling, swept 
along until they cannot travel farther. 

Where they land is home. They put down 
roots, they grow. 

She wished her father was there. Seeds, 
she would tell him, are designed to travel, 
to seek out habitable spaces, leaving behind 
their progenitors, pushing forward into the 
wide future. 

Hajar slipped the blue stone from her 
pocket. This place felt right, already full of 
life. How many seeds were even now under 
the soil, waiting to grow? 

She clutched the stone close, then buried 
it in the soft earth. 


Beth Goder works as an archivist, 
processing the papers of economists, 
scientists and other interesting folk. Her 
fiction has appeared in venues such as 
Escape Pod, Fireside and an anthology 
from Flame Tree Press. You can find her 
online at http://www.bethgoder.com. 